I need to run this command using python subprocess module (notice I'm using only variables). If shell=True, the executable argument specifies which shell to use. On Unix, the default shell is /bin/sh. On Windows, the default shell is specified by the COMSPEC environment variable. The only reason you would need to specify shell=True on.
Unlike Javascript, which is naturally asynchronous, Python interpreter executes codes in a sequential order. The subsequent jobs have to wait until the completeness of the previous ones. This behavior sometimes has disadvantages, especially when the processing time for certain jobs in the queue is fairly long and the subsequent jobs do not really rely on the results of previous ones. This raises a question: Could we take advantage of the waiting time in an efficient way and possibly parallelize the workflow of processing of the sequential jobs.
![Install Install](http://docs.activestate.com/activepython/3.6/dip3/i/mac-install-2-information.png)
Well, the answer is of course yes. Let me give you an example of a problem I often came across in my work. Everyday I have to run scripts to process satellite data. The scripts will first read raw data files, each file corresponding to one frame (a frame corresponds to a specific area on earth at a specific time). The scripts read data and process them sequentially, one at a time. This is highly inefficient and takes a lot of time for the whole processing to be done. I was then thinking if there is a way to make the processing of the different files parallel, which means they are all processed concurrently. And Python
subprocess
module came to the rescue.Subprocess module
According to the official documentation, the
subprocess
module allows you to spawn new processes, connect to their input, output, and error pipes, and obtain their return codes.Start a process in Python
One common scenario you might want to use
subprocess
is when you want to start a new application (say, another python script, or a shell script) from your Python program. Let’s go through an example and I’ll show you how to use subprocess
module.For example, you are writing a
process_data.py
program which performs some data processing work. However, the raw data files are compressed (e.g. they are some bz2 files) in order to reduce space storage. The compressed files need to be uncompressed first. This uncompressing task is performed by a bunzip2_file.py
program. Normally, you don’t want to bother to run bunzip2_file.py
command in the terminal first and then run process_data.py
. It will be more convenient to just run process_data.py
. And within this program, it will spawn a subprocess, invoking bunzip2_file.py
. Let’s take a look at the codes:First we import
subprocess
functions Popen
and PIPE
. Popen
is the class which handles process creation and management in the module. It could take a series of arguments. You may refer to the official subprocess documentation for the full Popen
arguments list. Most of the time you only need to pass arguments cmd_list, stdout and stderr, and sometimes shell boolean value.The cmd_list should be a sequence of program arguments or else a single string, just the same as you type in the terminal. The shell argument specifies whether to use the shell as the program to execute. If shell is true, it is recommended to to pass cmd_list as a string. stdout and stderr specify the executed program’s standard output and standard error file handles. Most of the time we’ll pass
PIPE
as the value to stdout and stderr, which means a new pipe to the child should be created. By specifying it this way, it means you are going to redirect the standard output and error to a new pipe, which normally defaults to the terminal. You can also give other values to stdout or stderr such as an existing file object, if you want to redirect the output to a file. Then you read the stdout and stderr and print it in the terminal accordingly.Parallel processing
Now, suppose you have multiple compressed files, if you don’t use any parallel processing, the program will uncompress the file one by one, until the last one is uncompressed. This is time-consuming and inefficient. If we use parallel processing, however, the processing time will be 1/N of the sequential processing, where N is the number of files to be uncompressed.
We can use
subprocess
module to create multiple child processes and they are run in parallel.First, we search the current directory and obtain a list of all the compressed files. Next, we create a list of the sequence of program arguments, each list element corresponding to each file. Then, we create a process list using
Popen
for each command. Finally, we wait for the child processes to terminate. Plain and simple.Conclusion
Python
subprocess
module is useful for starting new processes in Python and running them in parallel. Parallel processing could substantially reduce the processing time. Careful readers might notice that subprocess can be used if we want to call external programs in parallel, but what if we want to execute functions in parallel. The answer is to use another python module multiprocessing
. I’ll write another article covering multiprocessing
.Written by Shuzhan Fan on Dec 17, 2017
RELATED ARTICLES
Я хочу написать сценарий для автоматической настройки новой установки ubuntu и установки приложения на основе django. Поскольку скрипт будет запущен на новом сервере, скрипт Python должен автоматически установить некоторые требуемые модули.
Вот сценарий.
Установка
pexpect
является успешной, однако следующий import pexpect
не удался. Я думаю, это потому, что во время выполнения код не знает о недавно установленном pexpect
. Как установить и импортировать модули Python во время выполнения? Я открыт для других подходов.
Вы можете импортировать pip вместо использования подпроцесса:
Другое дело:
[редактировать]
Вы должны рассмотреть возможность использования требования .txt вместе с pip. Похоже, вы пытаетесь автоматизировать развертывание (и это хорошо!), В моем поясе для инструмента у меня также есть виртуальный виброповерт, бродячий и незаменимый .
Это результат для меня:
Я решил проблему с помощью модуля
imp
. ![Subprocess module python Subprocess module python](/uploads/1/2/5/6/125627733/711334436.png)
На самом деле код менее идеален, так как мне нужно жестко установить каталог модуля Python. Но поскольку сценарий предназначен для известной целевой системы, я думаю, что все в порядке.
Я на самом деле сделал модуль для этой точной цели ( установки )
Это очень просто:
Ссылки Github для вопросов / вкладов