SOAMC= ENDGAME: Extraction and Decrunching progress
in All about SOAMC= | Thursday, May 10, 2018 | 10:01

The following list shows all the files we are identifying, extracting and decrunching (with XFDDecrunch) PRIOR to actual module/music scanning. They are divided into groups. This process was started in April 2018 and will take several months.

Update 15 November 2018: This post will no longer be updated; as of this date it is just part of the history :-)

Update 07 Nov 2018: Final phase of the extraction and decrunching is expected to be finalized early November 2018. No more updates will be done to this page.

Update 01 Oct 2018: Added final list of XFD slaves and XPK libs. Links at bottom, hopefully a gold mine for all Amiga fans.

Update 23 Sep 2018: Removed references and downloads to XFD Slaves and XPK libraries for now (will be re-released soon). Updated my SOTDS database which has now over 31 million entries instead of previous 28 million. It is now used as a base to identify, extract and process files.

Update 4 Aug 2018: Current files tagged for music format scanning: 25455631 (most likely we are around 26 million files!)

Update 4 Aug 2018: We have entered the final stage of extracting the files that are to be scanned for musical signatures (see my other post for details). We have about 20 million entries to deal with, then everything will be decrunched, so this is gonna take some more weeks or months :-)

The files located in my Amiga collection come from my most recently compiled SOTDS database, which has not yet been released online in SOTDS format, as the software "SOTDS_SEARCHER.EXE" and "SOTDS_CONSTRUCTOR.EXE" is still in development.

Since my .DAT files are already in text format, I'm currently using other tools to filter out file signatures for ADF, ADZ, DMS, LHA etc.
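As a rough illustration of that kind of filtering, here is a minimal Python sketch. It assumes the text listing holds one file path per line and that the match is done on the filename extension; neither detail is stated above, and the names `CONTAINER_EXTENSIONS`, `is_container` and `filter_containers` are made up for this example.

```python
from pathlib import PurePosixPath

# Extensions taken from the formats named above (ADF, ADZ, DMS, LHA).
# Matching on the extension is an assumption made for this sketch.
CONTAINER_EXTENSIONS = {".adf", ".adz", ".dms", ".lha"}


def is_container(path: str) -> bool:
    """Return True if the path ends in one of the container extensions."""
    return PurePosixPath(path.strip()).suffix.lower() in CONTAINER_EXTENSIONS


def filter_containers(lines):
    """Yield the entries of a text listing that name a disk image or archive.

    Assumes one path per line, which may not match the real .DAT layout.
    """
    for line in lines:
        if is_container(line):
            yield line.strip()
```

Feeding it the lines of an open text file, for example `filter_containers(open("listing.txt"))` with a hypothetical file name, would yield only the ADF, ADZ, DMS and LHA entries.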

The current statistics of what we are processing:
Rows: 31133159 (31.13mill)
Total Words: 217932113 (217.93mill)
Locations: 883872 (0.88mill)
Original Database: 3061080989 bytes (3061.08 MB)

Rows = actual single files. Used in our scanning.
Total Words = Only applicable for SOTDS Searcher (how many unique words there are). Not used in the scanning procedure.
Locations = Sources that are disk-images, archives, cd-images as listed below and used in our scanning.

Now on with the progress and filters:

For Executable files identified by 000003f300000000 - first 8 bytes
All scanning and processing DONE!

We have selected the most common and easily recognized headers typically used for datafiles, since these do not start with the executable header. Naturally, we cannot list every kind of cruncher in this section, as it would take months to find every compressor ever used on Amiga files and to sort and tag them like this. We have simply chosen these to try and cut down the processing at least a little bit :-)

As a bonus report we have stored every cruncher detected on executables, see the link at the bottom of this page.

For PowerPacker (PP20) datafiles identified by 50503230090a - first 6 bytes
All scanning and decrunching DONE!

For XPK (XPKF) Crunched datafiles identified by 58504b46 - first 4 bytes
All scanning and decrunching DONE!

For CrunchMania (CrM) Crunched datafiles identified by 43724d - first 3 bytes
All scanning and decrunching DONE!

For StoneCracker (S40) Crunched datafiles identified by 533430 - first 3 bytes
All scanning and decrunching DONE!

For single files located outside/inside disk-images and archives, for all variants above
All scanning and decrunching DONE!
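
The header checks listed above can be sketched in a few lines of Python. This is only an illustration built from the byte values given in this post, not the tooling actually used here; `SIGNATURES`, `classify` and `classify_file` are hypothetical names.

```python
# Signatures exactly as listed above, compared against the start of a file.
SIGNATURES = [
    ("executable", bytes.fromhex("000003f300000000")),  # first 8 bytes
    ("PP20", bytes.fromhex("50503230090a")),            # first 6 bytes
    ("XPKF", bytes.fromhex("58504b46")),                # first 4 bytes
    ("CrM", bytes.fromhex("43724d")),                   # first 3 bytes
    ("S40", bytes.fromhex("533430")),                   # first 3 bytes
]


def classify(header: bytes):
    """Return the name of the first matching signature, or None."""
    for name, magic in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def classify_file(path):
    """Read the first 8 bytes of a file (the longest signature) and classify it."""
    with open(path, "rb") as handle:
        return classify(handle.read(8))
```

A file for which `classify` returns `None` does not match any of the groups listed so far.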

For all the rest
Scanning and decrunching covers ALL files located inside and outside of disk-images, archives & CD-images that DON'T match the signatures above; they will be decrunched as well. These are regular data files, compressed data files or plain music files in their original format, but also image files, info files, sample files, text files etc. etc. That means a lot of junk, but we will let nothing escape our scan. The only limit we have is that a file must be larger than 200 bytes.

Files detected: 20510380 (20.5 million) - Extraction finalized during early November 2018.

After all of this has been processed as listed above, we can start on the actual music/module identifying and ripping through all decrunched executables and non-executables - which will take additional months for a multitude of WinUAE instances :-)

XFD Slaves and XPK Library List
We have mentioned in several places that we use XFDDecrunch to decrunch all extracted files (that means from disk-images, archives, plain files, CD-images etc.). Since the decrunching started in early May 2018, the following list of "xfd/" slaves and ".library" files (which connect via xpkmaster.library and are also used by XFDDecrunch where needed) has been used in our decrunching environment.

I have located 300+ pcs of XFD slaves and xxxx pcs of XPKxxxx.LIBRARY during my side-project-hunting! Want them? Simply download the entire pack below! Note: I did an MD5 checksum on these, along with filesize and version checking, so what you are looking at is the most complete list of XFD slaves ever to surface or be listed anywhere, and most likely the newest/last version that was released!

The SOTDS database system was used to locate these quickly, then other custom tools extracted ALL files that matched and cross-checked them to drop dupes while keeping the unique variants.
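
For readers who want to do a similar cross-check on their own collection, a minimal Python sketch of dupe removal by checksum could look like this. It assumes two files count as dupes when both their size and MD5 match, regardless of filename, and it leaves out the version checking mentioned above; `unique_variants` is a hypothetical helper, not one of the custom tools.

```python
import hashlib


def unique_variants(files):
    """Keep the first occurrence of each distinct file content.

    `files` is an iterable of (name, data) pairs, with data as bytes.
    Two entries are treated as dupes when size and MD5 both match;
    same-named files with different content are kept as variants.
    """
    seen = set()
    kept = []
    for name, data in files:
        key = (len(data), hashlib.md5(data).hexdigest())
        if key in seen:
            continue  # identical content already kept
        seen.add(key)
        kept.append((name, data))
    return kept
```

Keying on content rather than on the filename is what lets two different builds of the same slave or library both survive as unique variants.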

See link below.

XPK Libraries:
See link below.

