Python multiprocessing with multiple arguments
I'm trying to multiprocess a function that performs multiple actions on a large file, but I'm getting a pickling error even though I'm using partial.

The function looks like this:
    def process(r, intermediate_file, record_dict, record_id):
        res = 0
        record_str = str(record_dict[record_id]).upper()
        start = record_str[0:100]
        end = record_str[len(record_str)-100:len(record_str)]
        print sample, record_id
        if r == "1":
            if something:
                res = something...
                intermediate_file.write("...")
            if something:
                res = intermediate_file.write("...")
        if r == "2":
            if something:
                res = something...
                intermediate_file.write("...")
            if something:
                res = intermediate_file.write("...")
        return res
The way I'm calling it is the following, inside a function:
    def call_func():
        intermediate_file = open("inter.txt", "w")
        record_dict = get_record_dict()  ### infos for each record in a dict, keyed by record_id
        results_dict = {}
        pool = Pool(10)
        for a in ["a", "b", "c", ...]:
            if not results_dict.has_key(a):
                results_dict[a] = {}
            for b in ["1", "2", "3", ...]:
                if not results_dict[a].has_key(b):
                    results_dict[a][b] = {}
                    results_dict[a][b]['res'] = []
                infile = open(a + b + ".txt", "r")
                ...parse the file and return the values in a list called "record_ids"...
                ### call the function for each record_id in record_ids
                if b == "1":
                    func = partial(process, "1", intermediate_file, record_dict)
                    res = pool.map(func, record_ids)
                    ## append the results for each pair (a, b) for each record in results_dict
                    results_dict[a][b]['res'].append(res)
                if b == "2":
                    func = partial(process, "2", intermediate_file, record_dict)
                    res = pool.map(func, record_ids)
                    ## append the results for each pair (a, b) for each record in results_dict
                    results_dict[a][b]['res'].append(res)
        ... results_dict ...
The idea is that for each record inside record_ids, I want to save the results for each pair (a, b).
I'm not sure what is giving me this error:
    File "/code/python/python-2.7.9/lib/multiprocessing/pool.py", line 251, in map
        return self.map_async(func, iterable, chunksize).get()
    File "/code/python/python-2.7.9/lib/multiprocessing/pool.py", line 558, in get
        raise self._value
    cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
func is not defined at the top level of the code, so it can't be pickled. You can use pathos.multiprocessing, which is not a standard module, but it will work.
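To see why the top-level requirement matters, here is a minimal sketch (Python 3 syntax; the names top_level and make_inner are illustrative, not from the original code): pickle can serialize a partial over a module-level function, because it stores the function by name, but it fails on a function defined inside another function.

```python
import pickle
from functools import partial

def top_level(r, record_id):
    # A module-level function: pickle can look it up by name.
    return r + record_id

def make_inner():
    # A function defined inside another function has no
    # importable name, so pickling it fails.
    def inner(record_id):
        return record_id
    return inner

# A partial over a top-level function pickles fine.
func = partial(top_level, "1")
restored = pickle.loads(pickle.dumps(func))
print(restored("x"))  # prints 1x

# Pickling a nested function raises an error, much like the
# PicklingError from Pool.map in the question.
try:
    pickle.dumps(make_inner())
except (pickle.PicklingError, AttributeError) as e:
    print("cannot pickle nested function:", e)
```

This is the same lookup that Pool.map performs when it sends the function to the worker processes.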
Or, use something different from Pool.map, maybe a queue of workers? https://docs.python.org/2/library/queue.html
At the end of that page there is an example you can use; it's for threading, which is very similar to multiprocessing, and there are queues there too:
https://docs.python.org/2/library/multiprocessing.html#pipes-and-queues
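A minimal sketch of that worker-queue pattern (shown with Python 3 module names; the process function and the task tuples are placeholders for the real per-record work, not the original code):

```python
import threading
from queue import Queue

def process(r, record_id, results):
    # Placeholder for the real per-record work.
    results.append((r, record_id))

def worker(q, results):
    # Each worker pulls tasks until it sees the None sentinel.
    while True:
        item = q.get()
        if item is None:
            q.task_done()
            break
        r, record_id = item
        process(r, record_id, results)
        q.task_done()

def run(tasks, num_workers=4):
    q = Queue()
    results = []
    threads = [threading.Thread(target=worker, args=(q, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for task in tasks:
        q.put(task)
    for _ in threads:
        q.put(None)  # one sentinel per worker
    q.join()         # wait until every task is marked done
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(run([("1", "rec_a"), ("2", "rec_b")]))
```

Because the work is handed over through a Queue inside one process, nothing has to be pickled, which sidesteps the error entirely; the trade-off is that threads share the GIL, so this helps most when the work is I/O-bound.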