Multiprocessing

Parallel Processing

I recently read an article about parallel processing. It reminded me of my domain checker service, which checks a lot of domains for their availability. The script runs sequentially and needs around 30 seconds.

Initially I worked on a caching mechanism to speed up results. But if a service is not used that often (nobody is using my domain checker…), there is not much you can gain with caching.
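
To make the caching point a bit more concrete, here is a minimal sketch of a time-based cache. The real cache of the domain checker is not shown in this post, so `TTLCache`, `slow_check` and the 300 second lifetime are assumptions made up for illustration. It shows why rare traffic defeats caching: if the next request arrives after the entry has expired, the slow check runs again anyway.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry logic can be tested
        self._entries = {}           # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        # Serve from the cache only while the entry is younger than the TTL
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Miss or expired: do the slow work and remember when it was stored
        value = compute(key)
        self._entries[key] = (now, value)
        return value


def slow_check(domain):
    # Hypothetical stand-in for the real sequential domain check
    time.sleep(0.1)
    return {"domain": domain, "checked": True}


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=300)
    cache.get_or_compute("example.com", slow_check)  # miss: runs slow_check
    cache.get_or_compute("example.com", slow_check)  # hit: served from the cache
```

A cache like this only pays off when the same domain is requested again within the lifetime, which is exactly what does not happen on a rarely used service.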

So I gave multiprocessing a try, and I have to admit I’m pretty happy with the result!

https://docs.python.org/3/library/multiprocessing.html

Sample Code Snippet

from multiprocessing import Pool
from pprint import pprint

import dns.resolver

timeout = 1
result = {}

def check_soa_record(domain):
    # resolve() raises (NXDOMAIN, NoAnswer, timeout, ...) when no SOA record is found
    try:
        dns.resolver.resolve(domain, "SOA", lifetime=timeout)
        return domain, True
    except Exception:
        return domain, False


def check_a_record(domain):
    # Same check for IPv4 address (A) records
    try:
        dns.resolver.resolve(domain, "A", lifetime=timeout)
        return domain, True
    except Exception:
        return domain, False


def check_aaaa_record(domain):
    # Same check for IPv6 address (AAAA) records
    try:
        dns.resolver.resolve(domain, "AAAA", lifetime=timeout)
        return domain, True
    except Exception:
        return domain, False


if __name__ == "__main__":
    domains = [
        "google.com",
        "facebook.com",
        "twitter.com",
        "stackoverflow.com",
    ]

    # Create a Pool with 4 worker processes
    with Pool(4) as p:
        # map() applies the given check function to each element in the domains list;
        # each map() call blocks until all domains for that record type are done
        result["soa"] = p.map(check_soa_record, domains)
        result["a"] = p.map(check_a_record, domains)
        result["aaaa"] = p.map(check_aaaa_record, domains)

    pprint(result)

Run it

time python test.py 
{'a': [('google.com', True),
       ('facebook.com', True),
       ('twitter.com', True),
       ('stackoverflow.com', True)],
 'aaaa': [('google.com', True),
          ('facebook.com', True),
          ('twitter.com', False),
          ('stackoverflow.com', False)],
 'soa': [('google.com', True),
         ('facebook.com', True),
         ('twitter.com', True),
         ('stackoverflow.com', True)]}

real	0m0.335s
user	0m0.201s
sys	0m0.074s

As you can see, it takes 0.3 seconds to check 4 domains for existing A, AAAA and SOA records … cool :)
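
One possible refinement, sketched under assumptions: the snippet above calls `map()` three times, so the SOA, A and AAAA rounds still run one after another. Flattening everything into (domain, record type) pairs lets a single `map()` call spread all lookups over the pool. The names `build_tasks`, `check_record`, `group_results` and `fake_lookup` are hypothetical, and `fake_lookup` is only a stand-in so the sketch runs without network access; in the real script it would be replaced by `dns.resolver.resolve(domain, record_type, lifetime=timeout)`.

```python
import time
from itertools import product
from multiprocessing import Pool
from pprint import pprint

RECORD_TYPES = ("SOA", "A", "AAAA")


def fake_lookup(domain, record_type):
    # Hypothetical stand-in for dns.resolver.resolve(): sleeps briefly and
    # fails for names under .invalid, succeeds for everything else
    time.sleep(0.01)
    if domain.endswith(".invalid"):
        raise LookupError(f"no {record_type} record for {domain}")


def build_tasks(domains):
    # One task per (domain, record type) combination, in domain order
    return list(product(domains, RECORD_TYPES))


def check_record(task):
    # Worker function: takes one task tuple so it can be used with Pool.map()
    domain, record_type = task
    try:
        fake_lookup(domain, record_type)
        return record_type, domain, True
    except Exception:
        return record_type, domain, False


def group_results(rows):
    # Rebuild the {"soa": [...], "a": [...], "aaaa": [...]} shape used above
    grouped = {record_type.lower(): [] for record_type in RECORD_TYPES}
    for record_type, domain, found in rows:
        grouped[record_type.lower()].append((domain, found))
    return grouped


if __name__ == "__main__":
    domains = ["google.com", "facebook.com", "example.invalid"]
    with Pool(4) as p:
        # map() keeps the input order, so each list stays in domain order
        rows = p.map(check_record, build_tasks(domains))
    pprint(group_results(rows))
```

Whether this is actually faster than three separate `map()` calls depends on the number of domains and the pool size, so it is worth timing both variants before switching.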


Any comments?
