Compression is very slow

Jun 11, 2009 at 9:38 AM
Edited Jun 11, 2009 at 9:40 AM

I tested 7z in my C# project and it works, but it is very slow. I compared two approaches: (1) ShellExecute and (2) SevenZipSharp.

 

1. Shellexecute:

~ 30sec

            vshellexecute = DateTime.Now;
            ShellExecute("7z", "a -r -mx=9 \"H:\\Backup.7z \" \"E:\\FilesToCompress\"");
            Console.WriteLine(DateTime.Now.Subtract(vshellexecute));

2. SevenZipSharp:

~1:40min

            vassembly = DateTime.Now;
            SevenZipCompressor tmp = new SevenZipCompressor();
            SevenZipCompressor.SetLibraryPath(@"C:\PathTo7z\7z.dll");
            tmp.ArchiveFormat = OutArchiveFormat.SevenZip;
            tmp.CompressionMethod = CompressionMethod.Default;
            tmp.CompressionLevel = CompressionLevel.Ultra;
            tmp.Compressing += new EventHandler<ProgressEventArgs>((s, e) =>
            {
                Console.Clear();
                Console.WriteLine(String.Format("{0}%", e.PercentDone));
            });
            tmp.CompressDirectory(
              @"E:\FilesToCompress\",
              @"H:\Backup.7z");
            Console.WriteLine(DateTime.Now.Subtract(vassembly));

The archive is 20 MB; the source is 1200 files in 60 folders, 60 MB uncompressed.

The resulting 7z archives are the same size, and both files seem to be correct.
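
A side note on measurement: the snippets above time with DateTime.Now. A minimal sketch of the same measurement using System.Diagnostics.Stopwatch follows. The Measure helper and the TimingSketch class are hypothetical names introduced here for illustration, and Thread.Sleep only stands in for the compression call. At the 30-second scale reported above, this is unlikely to change the conclusion; it just removes clock resolution as a variable.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class TimingSketch
{
    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Thread.Sleep is a placeholder for the call being timed,
        // e.g. the ShellExecute or CompressDirectory call.
        TimeSpan elapsed = Measure(() => Thread.Sleep(50));
        Console.WriteLine(elapsed);
    }
}
```

Wrapping both variants in the same helper keeps the two timings comparable.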

Coordinator
Jun 12, 2009 at 12:09 PM
Edited Jun 12, 2009 at 7:44 PM

I have performed some tests.

Compressing the program files of the Russian dictionary Lingvo x3 (mostly binary files, some already compressed; 46 files, 37 MB overall).

1) 7z.exe test.

            DateTime vshellexecute = DateTime.Now;
            ProcessStartInfo si = new ProcessStartInfo();
            si.FileName = @"c:\Program Files\7-Zip\7z.exe";
            si.UseShellExecute = true;
            si.Arguments = "a -r -mx=9 \"d:\\Temp\\arch.7z\" \"c:\\Program Files\\ABBYY Lingvo x3\"";
            Process p = Process.Start(si);
            p.WaitForExit();
            Console.WriteLine(DateTime.Now.Subtract(vshellexecute));

First call: 00:02:05.9780000 (more than 2 minutes, "Scanning" was slow - excluded)

Second call: 00:00:18.5480000 (18.5 seconds).

Third call:  00:00:19.6940000 (19.7 seconds).

2) SevenZipSharp test (Debug).

            DateTime vsevenzipsharp = DateTime.Now;
            SevenZipCompressor tmp = new SevenZipCompressor(true);
            tmp.CompressionLevel = CompressionLevel.Ultra;
            tmp.CompressDirectory(@"c:\Program Files\ABBYY Lingvo x3\", @"D:\Temp\arch1.7z");
            Console.WriteLine(DateTime.Now.Subtract(vsevenzipsharp));
            vsevenzipsharp = DateTime.Now;
            tmp.CompressDirectory(@"c:\Program Files\ABBYY Lingvo x3\", @"D:\Temp\arch2.7z");
            Console.WriteLine(DateTime.Now.Subtract(vsevenzipsharp));

Output:

00:00:18.2530000 (18 seconds)

00:00:16.8440000 (17 seconds)

The results were as I expected. Nothing to add.

 

Jun 14, 2009 at 2:14 PM

Very strange... I repeated the test with the same result: SevenZipSharp takes 7 times as long as 7z with ShellExecute. I started the program several times.

 

Some information about my computer; maybe something here explains that behavior:

Windows 7 RC1 64bit

Visual C# 2008 Express Edition

Intel Core i7 with 6 GB RAM

7zip 4.65 64 Bit Edition

Coordinator
Jun 15, 2009 at 9:27 PM

Sorry, I forgot about that specific issue. If you compress lots of files (> 1000), SevenZipSharp tries to find the common root and sorts the file names, so you experience poor performance. Generally speaking, it compresses faster than ShellExecute but scans more slowly when there is a large number of files. I thought I had fixed it properly, but now it seems I should implement another approach to scanning for files (not so smart but much faster).

Coordinator
Jun 16, 2009 at 7:37 PM

Well, worked it out :) Now CompressDirectory is rather fast, and faster than the original 7-zip. Check the 0.50 release.

Aug 12, 2009 at 12:41 PM

I still experience the same problem. CompressDirectory performs very slowly on a directory with a large number of small files (~5-10K).

The command-line 7-Zip tool processes this folder in less than a minute, while a simple test program linked against SevenZipSharp.dll v0.56 and making a single method call (CompressDirectory) takes much longer: 10-15 minutes. Do you have any ideas?

Thanks

Coordinator
Sep 1, 2009 at 11:32 PM

I performed additional tests compressing the Visual Studio 2008 folder (370 MB, 4013 files). SevenZipSharp took 4:26, while 7-zip took 3:00. When I hard-coded all event notifications off, the time dropped to 3:45. I decided to add a special FastCompression property to the compressor that switches off all events in exchange for faster compression.

 

Mar 7, 2010 at 11:09 PM

I've also run into slow compression with SevenZipSharp. Here are the results of a bit of testing with your sample application (on a directory containing 22 MB in 643 files):

Compression / Time Taken

None / 2 seconds

Fast / 7 seconds

Low / 9 seconds

Normal / 46 seconds (versus only 10 seconds with the 7-Zip release)

High / 72 seconds

Ultra / 69 seconds

Any idea why Normal is quite fast with the 7-Zip release but 4.6x slower with SevenZipSharp? For now I've switched to Low as a workaround, but it would be nice if Normal worked well too.

Thanks

Mar 8, 2010 at 9:44 AM

I had a similar problem: slow compression via code, but no such problems with the shell. The problem disappeared when I upgraded my 7-zip installation to the most recent version (I had a version from 2006 installed on my machine). Could be worth a try?

Mar 8, 2010 at 10:48 AM

I just installed 7-zip yesterday, so it's the latest 4.65.

Sep 17, 2010 at 9:39 PM
Edited Sep 17, 2010 at 11:55 PM

@markhor We are using the latest 0.64 version against the 9.x beta release. The FastCompression flag doesn't really help with CompressDirectory when the directory contains thousands of files (all small XML files); it is still about 6x slower than using 7z directly. SevenZipSharp seems great for a small number of files, but with large numbers it is not useful; it is faster to just shell out to the command-line version.

Is this some form of COM overhead? Since SevenZipSharp is a wrapper, I would think it just has to pass the call to the DLL once and then 7z.dll does all the work. What could be causing it to be this slow?

Data:
10,863 files in 13 folders totaling 320MB 

Settings same between 7z native and SevenZipSharp
LZMA, Ultra, AES-256, EncryptHeaders = True, DirectoryStructure = true, FastCompression = true

SevenZipSharp took 6 minutes and 8 seconds

Native 7z using GUI took 1 minute and 17 seconds.

Output was identical between the two otherwise.

 

Update:

Looking through the SevenZipCompressor source, I noticed that CompressDirectory converts the directory into a List<string> of its files and then calls CompressFilesEncrypted, which converts that List<string> to a FileInfo[] before calling into the low-level wrapper.

I do not know what 7z.dll exposes, but it may offer a way to pass just the directory, as you can with the EXE or the GUI, and let the native code handle all the file iteration. I am not even sure whether this contributes to the slowdown.

I also noticed that in 7z the default dictionary size is 64 MB, the word size is 64, and the solid block size is 4 GB; there is also an option for the number of CPU threads, with a default of 2 out of 8.

I changed your default SevenZipCompressor.LzmaDictionarySize from 1 << 22 to 1 << 48 to set it to 64 MB. I couldn't find where you set the word size and solid block size, but the dictionary size didn't affect performance at all.

Is it possible to provide settings in SevenZipCompressor to control the word size and solid block size, and perhaps the number of CPU threads it uses? I know you are limited to what 7z.dll makes available, so I would understand if these options cannot be offered.
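
A note on the shift arithmetic mentioned above, as a minimal sketch. It assumes LzmaDictionarySize is a 32-bit int (an assumption, not confirmed in this thread); ShiftInt and DictionarySizeSketch are hypothetical names used only for illustration. In C#, 64 MB corresponds to 1 << 26, and for a 32-bit int only the low 5 bits of the shift count are used, so a count of 48 behaves like a count of 16.

```csharp
using System;

public static class DictionarySizeSketch
{
    // Same arithmetic as writing "1 << bits" with 32-bit int operands.
    public static int ShiftInt(int bits)
    {
        return 1 << bits;
    }

    public static void Main()
    {
        Console.WriteLine(ShiftInt(22)); // 4194304  bytes = 4 MB
        Console.WriteLine(ShiftInt(26)); // 67108864 bytes = 64 MB
        // For int operands C# masks the shift count with 0x1F,
        // so 48 is treated as 16.
        Console.WriteLine(ShiftInt(48)); // 65536 bytes = 64 KB
    }
}
```

Under that assumption, a value written as 1 << 48 would make the dictionary smaller rather than larger, which may be worth ruling out before concluding that the dictionary size has no effect.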

Coordinator
Sep 28, 2010 at 7:31 AM

Hello.

I only have a minute to answer, so this is what I think:

LzmaDictionarySize affects the execution of the managed LZMA code from the LZMA SDK.

The directory must be converted to a FileInfo[] because 7-Zip asks the wrapper for file sizes, names, etc. I think the performance degradation is caused by the construction of 1000+ FileInfo objects. To check this, use the special event in SevenZipCompressor (something like ...FilesScanned or FilesFound, etc.). That event is raised after all the files have been converted to FileInfo objects.
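
One way to test that hypothesis independently of the library is to time the enumeration and FileInfo construction alone. The sketch below uses only standard .NET calls; ScanSketch and ScanTotalBytes are hypothetical names, and this is not SevenZipSharp's actual scanning code. Directory.GetFiles can throw on folders the process cannot read, which the sketch does not handle.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ScanSketch
{
    // Enumerates every file under root, builds a FileInfo for each,
    // and reads Length to force the metadata lookup.
    public static long ScanTotalBytes(string root, out int fileCount)
    {
        string[] paths = Directory.GetFiles(root, "*", SearchOption.AllDirectories);
        long total = 0;
        foreach (string path in paths)
        {
            total += new FileInfo(path).Length;
        }
        fileCount = paths.Length;
        return total;
    }

    public static void Main(string[] args)
    {
        // Scans the folder given on the command line, or the current one.
        string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        Stopwatch sw = Stopwatch.StartNew();
        int count;
        long bytes = ScanTotalBytes(root, out count);
        sw.Stop();
        Console.WriteLine("{0} files, {1} bytes, scanned in {2}", count, bytes, sw.Elapsed);
    }
}
```

If this finishes in a second or two on the problem folder, the FileInfo construction is probably not where the minutes are going.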

Jan 16, 2013 at 5:27 PM
markhor wrote:
The directory must be converted to a FileInfo[] because 7-Zip asks the wrapper for file sizes, names, etc. I think the performance degradation is caused by the construction of 1000+ FileInfo objects. To check this, use the special event in SevenZipCompressor (something like ...FilesScanned or FilesFound, etc.). That event is raised after all the files have been converted to FileInfo objects.

The event fires in the blink of an eye, so that's not the cause of the problem.

In the test that I am using there are 2800+ files. It appears that once you have processed about half of them, the remainder are processed very quickly.

Any other suggestions?