python multiprocessing with multiple arguments -


I'm trying to multiprocess a function that performs multiple actions on a large file, but I'm getting a known PicklingError even though I'm using partial.

The function looks like this:

def process(r, intermediate_file, record_dict, record_id):
    res = 0
    record_str = str(record_dict[record_id]).upper()
    start = record_str[0:100]
    end = record_str[len(record_str)-100:len(record_str)]
    print sample, record_id
    if r == "1":
        if something:
            res = something...
            intermediate_file.write("...")
        if something:
            res = ...
            intermediate_file.write("...")
    if r == "2":
        if something:
            res = something...
            intermediate_file.write("...")
        if something:
            res = ...
            intermediate_file.write("...")
    return res

The way I'm calling it is the following, inside a function:

def call_func():
    intermediate_file = open("inter.txt", "w")
    record_dict = get_record_dict()   ### infos for each record in a dict, based on record_id
    results_dict = {}
    pool = Pool(10)
    for a in ["a", "b", "c", ...]:
        if not results_dict.has_key(a):
            results_dict[a] = {}
        for b in ["1", "2", "3", ...]:
            if not results_dict[a].has_key(b):
                results_dict[a][b] = {}
            results_dict[a][b]['res'] = []
            infile = open(a + b + ".txt", "r")
            ...parse the file and return the values in a list called "record_ids"...
            ### call the function based on each record_id in record_ids
            if b == "1":
                func = partial(process, "1", intermediate_file, record_dict)
                res = pool.map(func, record_ids)
                ## append the results for each pair (a, b) for each record in results_dict
                results_dict[a][b]['res'].append(res)
            if b == "2":
                func = partial(process, "2", intermediate_file, record_dict)
                res = pool.map(func, record_ids)
                ## append the results for each pair (a, b) for each record in results_dict
                results_dict[a][b]['res'].append(res)
    ... do something with results_dict ...

The idea is that, for each record inside record_ids, I want to save the results for each pair (a, b).
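For reference, one way to restructure the calling pattern so that nothing unpicklable crosses the process boundary is to have the workers return both the result and the text to write, and let the parent process do all the file I/O. This is a minimal sketch under stated assumptions: the body of process and the tab-separated output format are made-up stand-ins for the question's elided logic, and collect is a hypothetical helper name.

```python
from functools import partial
from multiprocessing import Pool

def process(r, record_dict, record_id):
    # Hypothetical worker: returns (result, line_to_write) instead of
    # writing to a shared file handle, since an open file object
    # cannot be pickled and sent to a worker process.
    record_str = str(record_dict[record_id]).upper()
    res = len(record_str) if r == "1" else record_str.count("A")
    return res, "%s\t%s\t%d\n" % (r, record_id, res)

def collect(record_dict, record_ids, b, outfile):
    # Bind the constant arguments; only record_id varies per task.
    func = partial(process, b, record_dict)
    pool = Pool(4)
    try:
        results = pool.map(func, record_ids)
    finally:
        pool.close()
        pool.join()
    for res, line in results:
        outfile.write(line)   # the parent does all the file I/O
    return [res for res, _ in results]
```

The per-pair results returned by collect can then be appended into results_dict[a][b]['res'] exactly as in the question.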

I'm not sure what is giving me this error:

  File "/code/python/python-2.7.9/lib/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/code/python/python-2.7.9/lib/multiprocessing/pool.py", line 558, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

func is not defined at the top level of the code, so it can't be pickled. You could use pathos.multiprocessing, which is not a standard module, but it will work.

Or, you could use something different from pool.map, maybe a queue of workers: https://docs.python.org/2/library/queue.html

At the end of that page there is an example you can use; it's for threading, but it is very similar to multiprocessing, where there are also queues:

https://docs.python.org/2/library/multiprocessing.html#pipes-and-queues
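The worker-queue idea can be sketched as follows. The names (worker, run_workers) and the None sentinel protocol are illustrative choices, not from the question: each worker pulls record ids from a shared task queue and pushes results back, so the only thing that must be picklable is the top-level worker function and the queue items.

```python
from multiprocessing import Process, Queue

def worker(task_q, result_q, record_dict):
    # Pull record_ids until the sentinel None appears.
    for record_id in iter(task_q.get, None):
        record_str = str(record_dict[record_id]).upper()
        result_q.put((record_id, record_str))

def run_workers(record_dict, record_ids, n_workers=4):
    task_q, result_q = Queue(), Queue()
    procs = [Process(target=worker, args=(task_q, result_q, record_dict))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for record_id in record_ids:
        task_q.put(record_id)
    for _ in procs:
        task_q.put(None)          # one sentinel per worker
    results = [result_q.get() for _ in record_ids]
    for p in procs:
        p.join()
    return dict(results)
```

This avoids pool.map entirely, at the cost of managing the processes and sentinels yourself.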

