crx_unpack Package

This module mimics how Google Chrome unpacks CRX files as closely as possible. Doing so involves removing the CRX headers (see the structure details of CRXs on the Home page), separating out the underlying ZIP file, and extracting the contents of that ZIP file, among other things.

For end users, the only function you should need to call is unpack, which will handle each of the steps mentioned above.
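The header-removal step can be illustrated with a minimal sketch. It is not the module's own implementation: it assumes the publicly documented CRX2 and CRX3 layouts (a "Cr24" magic number, a little-endian version field, then length fields), and the names strip_crx_header and BadHeader are hypothetical stand-ins chosen for this example.

```python
import io
import struct
import zipfile

CRX_MAGIC = b"Cr24"


class BadHeader(Exception):
    """Hypothetical stand-in for crx_unpack.BadCrxHeader."""


def strip_crx_header(data: bytes) -> bytes:
    """Return the ZIP payload of a CRX given as bytes (illustrative only)."""
    if len(data) < 12 or data[:4] != CRX_MAGIC:
        raise BadHeader("missing Cr24 magic number")
    (version,) = struct.unpack_from("<I", data, 4)
    if version == 2:
        # CRX2: magic, version, public key length, signature length.
        if len(data) < 16:
            raise BadHeader("truncated CRX2 header")
        key_len, sig_len = struct.unpack_from("<II", data, 8)
        zip_start = 16 + key_len + sig_len
    elif version == 3:
        # CRX3: magic, version, length of the serialized header that follows.
        (header_len,) = struct.unpack_from("<I", data, 8)
        zip_start = 12 + header_len
    else:
        raise BadHeader(f"unsupported CRX version: {version}")
    if zip_start > len(data):
        raise BadHeader("header lengths exceed the file size")
    return data[zip_start:]


def make_zip_bytes() -> bytes:
    """Build a tiny in-memory ZIP so the sketch can be tried without a real CRX."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("manifest.json", "{}")
    return buf.getvalue()
```

The returned bytes can be wrapped in io.BytesIO and opened with zipfile.ZipFile, which is the "separate the underlying ZIP file" step in miniature.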

crx_unpack.unpack(crx_file, ext_dir=None, overwrite_if_exists=False, img_tallies=None, test_contents=True, passwd=None, skip_img_formats=None, unpack_in_subprocess=False, convert_in_subprocess=True, do_convert=False)

Unpack the CRX and extract it in the directory at ext_dir.

Return the absolute, normalized path to the extraction directory (useful if it wasn’t given as a parameter).

Parameters:
  • crx_file (str) – Path to the CRX file.
  • ext_dir (str) – Directory where to extract the contents.
  • overwrite_if_exists (bool) – When extracting to a directory that already exists, unpack will normally fail. Setting this to True will delete the contents of the destination directory before unzipping.
  • img_tallies (dict) – A dictionary for storing the number of each type of image file converted during the unpacking process.
  • test_contents (bool) – When unpacking the CRX, use the zipfile module’s test feature to test the validity of the embedded zip file before extraction.
  • passwd (str) – Optional password to use when extracting the CRX. If the CRX was obtained from Google’s Chrome Web Store, you should not need this. If you provide a password here, it will be passed on to the extract_zip function.
  • skip_img_formats (list or tuple) – The image formats to skip when attempting to convert them to PNG. This will typically include the strings ICO, PNG, and WEBP.
  • unpack_in_subprocess (bool) – Flag indicating if the job of unpacking the CRX should be done in a subprocess rather than calling the function directly. Usually this shouldn’t need to be set as it will only hinder performance.
  • convert_in_subprocess (bool) – Flag indicating if the job of converting the images in the CRX should be done in a subprocess rather than calling the function directly. Usually this SHOULD be set, since converting images can sometimes cause a segmentation fault, which kills the whole process.
  • do_convert (bool) – Flag indicating whether images should be converted during the unpacking process (intended to mimic Chrome’s unpacking process more closely).
Returns:

Directory where the archive was extracted.

Return type:

str
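A call to unpack might look like the sketch below. The keyword arguments are taken from the signature above; the helper name unpack_extension is hypothetical, and the unpack function is passed in as a parameter only so that the sketch runs without the package installed.

```python
def unpack_extension(unpack, crx_file, ext_dir):
    """Call an unpack-style function with conservative settings.

    `unpack` is expected to be crx_unpack.unpack. Returns the extraction
    directory together with the image tallies collected during conversion.
    """
    # The dictionary is filled in by unpack; the exact keys it uses are
    # not specified here, so treat its contents as opaque.
    img_tallies = {}
    out_dir = unpack(
        crx_file,
        ext_dir=ext_dir,
        overwrite_if_exists=False,   # fail rather than wipe an existing directory
        img_tallies=img_tallies,
        test_contents=True,          # validate the embedded ZIP before extracting
        do_convert=True,             # convert images, as Chrome does
        convert_in_subprocess=True,  # isolate conversion from the caller's process
    )
    return out_dir, img_tallies
```

With the package installed, this would be used as unpack_extension(crx_unpack.unpack, "example.crx", "out"), where the file and directory names are placeholders.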

crx_unpack.extract_zip(zip_file, extract_dir, pwd=None, test_contents=True, reraise_errors=True)

Simple wrapper around the Python zipfile.ZipFile class.

Typically, it is not necessary to call this function directly from anywhere other than the unpack function.

Parameters:
  • zip_file (str) – Path to the zip file to be extracted.
  • extract_dir (str) – Directory where the contents will be extracted.
  • pwd (str) – Password for the zip file.
  • test_contents (bool) – Whether to use the library’s testzip() function on the archive before extracting. Tests if the CRC and header of each file in the archive are valid.
  • reraise_errors (bool) – Whether errors are re-raised. The default, True, indicates that any errors that come up should just be re-raised. Set to False when the unpack script is run with the xo (extract only) command, in which case the function will return a non-zero value when an error occurs.
Return type:

None
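A wrapper of this kind can be approximated with the standard library alone. The sketch below is an assumption about the general shape, not the module's actual code: the name extract_zip_sketch is hypothetical, the reraise_errors handling is omitted, and encoding a str password as UTF-8 is a choice made for this example.

```python
import zipfile


def extract_zip_sketch(zip_file, extract_dir, pwd=None, test_contents=True):
    """Extract zip_file into extract_dir, optionally testing it first."""
    with zipfile.ZipFile(zip_file) as zf:
        if pwd is not None:
            # zipfile expects the password as bytes.
            zf.setpassword(pwd.encode("utf-8") if isinstance(pwd, str) else pwd)
        if test_contents:
            # testzip() returns the name of the first bad member, or None.
            bad_member = zf.testzip()
            if bad_member is not None:
                raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
        zf.extractall(extract_dir)
```

Running the test before extraction means a corrupt archive is rejected before any files are written to the destination directory.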

exception crx_unpack.BadCrxHeader

Bases: Exception

Raised when a CRX’s header length or values aren’t valid.