Usage example with compression switches

Apr 5, 2009 at 5:28 PM
Hello Everyone!

It would be very helpful if a usage example with compression switches were provided.

Specifically, I am looking for something equivalent to the -mm=Deflate -mfb=258 compression switches of the 7za command line.
Coordinator
Apr 5, 2009 at 11:02 PM
Unless you use the LZMA SDK code, it is impossible to specify the compression rate, and it is likely to stay impossible in the future.
Apr 6, 2009 at 4:13 AM
Thanks for the information.

But how do we use the LZMA SDK to specify the compression rate? Isn't it only for the 7z format?
Coordinator
Apr 6, 2009 at 7:44 AM
Yes, it is. The SevenZipCompressor.CompressStream/CompressBytes routines use the LZMA SDK and work only with 7-zip (strictly speaking, lzma). To change their compression rate at the moment, one has to modify the code (set the dictionary size). I haven't paid much attention to it so far; I will implement compression rates in those functions in the next release.

As for zip, 7z.dll does not appear to expose COM functions for changing the compression rate through the IOutArchive interface, so it will take a lot of time to work it out.
Apr 6, 2009 at 5:23 PM
The approach with CompressBytes/ExtractBytes worked fine:

        public byte[] SerializeDataSet(DataSet dataSet)
        {
            dataSet.RemotingFormat = SerializationFormat.Binary;
            using (MemoryStream input = new MemoryStream())
            {
                BinaryFormatter bf = new BinaryFormatter();
                bf.Serialize(input, dataSet);
                byte[] retval = SevenZip.SevenZipCompressor.CompressBytes(input.ToArray());
                return retval;
            }
        }

        public DataSet DeserializeDataSet(byte[] data)
        {
            byte[] buffer = SevenZip.SevenZipExtractor.ExtractBytes(data);
                
            using (MemoryStream mstream = new MemoryStream(buffer))
            {
                BinaryFormatter bf = new BinaryFormatter();
                return (DataSet)bf.Deserialize(mstream);
            }
        }
 
but I couldn't make the stream-based ones work. For example:

        public byte[] SerializeDataSet(DataSet dataSet)
        {
            BinaryFormatter bf = new BinaryFormatter();

            dataSet.RemotingFormat = SerializationFormat.Binary;
            using (MemoryStream input = new MemoryStream(), output = new MemoryStream())
            {
                bf.Serialize(input, dataSet);

                SevenZip.SevenZipCompressor.CompressStream(input, output, null, null);
                output.Flush();
                return output.ToArray();
            }
        }

        public DataSet DeserializeDataSet(byte[] data)
        {
            using (MemoryStream input = new MemoryStream(data), output = new MemoryStream())
            {
                SevenZip.SevenZipExtractor.DecompressStream(input, output, null, null);
                BinaryFormatter bf = new BinaryFormatter();
                return (DataSet)bf.Deserialize(output);
            }
        }

CompressStream produces only 18 bytes of output.

Regards,

Roberto



Coordinator
Apr 6, 2009 at 7:11 PM
Roberto, thanks for your post. I will work it out.
Coordinator
Apr 6, 2009 at 7:11 PM
This discussion has been copied to a work item.
Coordinator
Apr 6, 2009 at 7:41 PM
Fixed
Apr 8, 2009 at 7:02 AM
After going through the 7-Zip source, I found the following information, which may be of help in this regard (please verify).

In CPP\7Zip\Archive\IArchive.h, the ISetProperties interface is declared as:

ARCHIVE_INTERFACE(ISetProperties, 0x03)
{
  STDMETHOD(SetProperties)(const wchar_t **names, const PROPVARIANT *values, Int32 numProperties) PURE;
};

The SetProperties function is defined in the handlers of the various archive types, i.e. in <archive_type>HandlerOut.cpp in the CPP\7Zip\Archive\<archive_type> directory, where archive_type can be 7z, BZip2, GZip, or Zip.

I have looked into the SetProperties function in ZipHandlerOut.cpp, and it seems that the parameter names are the same as those used by the 7-Zip command-line version.
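As a rough illustration of how command-line switches could map onto the names/values arrays that SetProperties takes, here is a minimal C++ sketch. ParseMethodSwitches is a hypothetical helper, not part of 7-Zip or SevenZipSharp, and it assumes that each -m switch splits into a property name and a value at the first '=' (so -mfb=258 would become name fb, value 258). Values are kept as strings; converting them to PROPVARIANT values is left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of 7-Zip or SevenZipSharp): splits switches
// of the form "-m<name>=<value>" into (name, value) pairs. The mapping of
// switch text to property names is an assumption based on the post above.
std::vector<std::pair<std::wstring, std::wstring>>
ParseMethodSwitches(const std::vector<std::wstring>& switches)
{
    std::vector<std::pair<std::wstring, std::wstring>> props;
    for (const std::wstring& sw : switches)
    {
        // Only "-m..." switches are treated as compression properties here.
        if (sw.size() < 3 || sw[0] != L'-' || sw[1] != L'm')
            continue;
        const std::size_t eq = sw.find(L'=');
        if (eq == std::wstring::npos || eq == 2)
            continue; // no '=' at all, or an empty property name
        props.emplace_back(sw.substr(2, eq - 2), sw.substr(eq + 1));
    }
    return props;
}
```

With the switches from the first post, this yields the pairs (m, Deflate) and (fb, 258). Whether the zip handler accepts exactly these names would still need to be checked against ZipHandlerOut.cpp.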
Coordinator
Apr 8, 2009 at 12:06 PM
Interesting, but the question is how we can invoke SetProperties on our IOutArchive object when it declares only two functions (Update and GetFileTimeType).
Apr 9, 2009 at 6:34 AM

Please have a look at the CAgent::DoOperation method in CPP\7Zip\UI\Agent\AgentOut.cpp in the 7-Zip source.

The following code is at the end of it:

  CMyComPtr<ISetProperties> setProperties;
  if (outArchive->QueryInterface(IID_ISetProperties, (void **)&setProperties) == S_OK)
  {
    if (m_PropNames.Size() == 0)
    {
      RINOK(setProperties->SetProperties(0, 0, 0));
    }
    else
    {
      CRecordVector<const wchar_t *> names;
      for(i = 0; i < m_PropNames.Size(); i++)
        names.Add((const wchar_t *)m_PropNames[i]);

      NWindows::NCOM::CPropVariant *propValues = new NWindows::NCOM::CPropVariant[m_PropValues.Size()];
      try
      {
        for (int i = 0; i < m_PropValues.Size(); i++)
          propValues[i] = m_PropValues[i];
        RINOK(setProperties->SetProperties(&names.Front(), propValues, names.Size()));
      }
      catch(...)
      {
        delete []propValues;
        return E_FAIL;
      }
      delete []propValues;
    }
  }
  m_PropNames.Clear();
  m_PropValues.Clear();

  if (sfxModule != NULL)
  {
    CInFileStream *sfxStreamSpec = new CInFileStream;
    CMyComPtr<IInStream> sfxStream(sfxStreamSpec);
    if (!sfxStreamSpec->Open(sfxModule))
      return E_FAIL;
      // throw "Can't open sfx module";
    RINOK(CopyBlock(sfxStream, outStream));
  }

  RINOK(outArchive->UpdateItems(outStream, updatePairs2.Size(),updateCallback));
  return outStreamSpec->Close();
}

So basically it invokes QueryInterface on the outArchive object. In .NET, one can use the Marshal.QueryInterface method for this.

Can you please verify it? If it is possible, it would be a great feature to add to SevenZipSharp.

Coordinator
Apr 9, 2009 at 8:14 AM
bbhar, thank you very much, this makes all the difference!
If everything goes OK, I guess we will even be able to create SFX archives.
Coordinator
Apr 11, 2009 at 8:53 PM
bbhar, I am very pleased to announce that I have implemented compression levels and compression methods!
Code example:

            SevenZipCompressor tmp = new SevenZipCompressor();
            tmp.ArchiveFormat = OutArchiveFormat.Zip;
            tmp.CompressionLevel = CompressionLevel.High;
            tmp.CompressionMethod = CompressionMethod.Deflate64;

I will commit the code to SVN tonight.

Thank you again.

P.S. Working on SFX support!
Coordinator
Apr 11, 2009 at 10:33 PM
bbhar, please test the 0.41 release :)
Apr 15, 2009 at 6:24 AM

markhor, thanks for implementing the feature. I tested your code, and in preliminary testing it seems to be working OK. I have suggestions for a couple of new features, though.

  1. Add a CompressionLevel value called CompressionLevel.Custom and a method called SetCustomCompressionLevel, with which one can set custom compression method parameters such as fb={NumFastBytes}, pass={NumPasses}, and the other parameters described in the 7-Zip command-line help document.
  2. 7-Zip can detect the archive type automatically and extract files from it. For example, it can open .svgz, which is basically a gzipped .svg file. SevenZipSharp can only handle the extensions listed in the InExtensionFormats dictionary. Detecting the archive type automatically at extraction time would certainly help in this situation.
Coordinator
Apr 15, 2009 at 8:06 AM
The first feature can be implemented easily, but I am not so sure about the second. Thanks for your suggestions, I will see what I can do.
Coordinator
Apr 16, 2009 at 12:10 PM
Implemented (1). Check SVN. You may also test the new SFX feature.
Apr 16, 2009 at 1:12 PM

Many thanks. Regarding the second feature I requested, I found that the following function in CPP\7Zip\UI\Common\OpenArchive.cpp might contain the code for auto-detecting the archive type by checking the signatures of the different archive types/codecs.

HRESULT OpenArchive(
    CCodecs *codecs,
    int arcTypeIndex,
    IInStream *inStream,
    const UString &fileName,
    IInArchive **archiveResult,
    int &formatIndex,
    UString &defaultItemName,
    IArchiveOpenCallback *openArchiveCallback)

The following piece of code in the above function may be of interest:

    const Byte *buf = byteBuffer;
    Byte hash[1 << 16];
    memset(hash, 0xFF, 1 << 16);
    Byte prevs[256];
    if (orderIndices.Size() > 255)
      return S_FALSE;
    int i;
    for (i = 0; i < orderIndices.Size(); i++)
    {
      const CArcInfoEx &ai = codecs->Formats[orderIndices[i]];
      const CByteBuffer &sig = ai.StartSignature;
      if (sig.GetCapacity() < 2)
        continue;
      UInt32 v = sig[0] | ((UInt32)sig[1] << 8);
      prevs[i] = hash[v];
      hash[v] = (Byte)i;
    }

    processedSize--;
    for (UInt32 pos = 0; pos < processedSize; pos++)
    {
      for (; pos < processedSize && hash[buf[pos] | ((UInt32)buf[pos + 1] << 8)] == 0xFF; pos++);
      if (pos == processedSize)
        break;
      UInt32 v = buf[pos] | ((UInt32)buf[pos + 1] << 8);
      Byte *ptr = &hash[v];
      int i = *ptr;
      do
      {
        int index = orderIndices[i];
        const CArcInfoEx &ai = codecs->Formats[index];
        const CByteBuffer &sig = ai.StartSignature;
        if (sig.GetCapacity() != 0 && pos + sig.GetCapacity() <= processedSize + 1)
          if (TestSignature(buf + pos, sig, sig.GetCapacity()))
          {
            orderIndices2.Add(index);
            orderIndices[i] = 0xFF;
            *ptr = prevs[i];
          }
        ptr = &prevs[i];
        i = *ptr;
      }
      while (i != 0xFF);
    }

 

So it seems that the archive signature information is available in the corresponding codecs.
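To make the idea in the quoted loop easier to follow, here is a simplified, self-contained C++ sketch of it: each signature is indexed by its first two bytes, the buffer is scanned position by position, and the full signature is compared only where the two-byte key hits. This is not the 7-Zip code. It swaps the hash/prevs byte arrays for a std::unordered_map, and FindSignatures and Bytes are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Returns the indices (into `signatures`) of every signature that occurs
// somewhere in `buf`, ordered by the position of its first occurrence.
std::vector<std::size_t> FindSignatures(const Bytes& buf,
                                        const std::vector<Bytes>& signatures)
{
    // Bucket signatures by a 16-bit key built from their first two bytes.
    std::unordered_map<std::uint16_t, std::vector<std::size_t>> buckets;
    for (std::size_t i = 0; i < signatures.size(); i++)
    {
        const Bytes& sig = signatures[i];
        if (sig.size() < 2)
            continue; // signatures shorter than two bytes are skipped
        const auto key = static_cast<std::uint16_t>(sig[0] | (sig[1] << 8));
        buckets[key].push_back(i);
    }

    std::vector<std::size_t> found;
    std::vector<bool> matched(signatures.size(), false);
    for (std::size_t pos = 0; pos + 1 < buf.size(); pos++)
    {
        const auto key = static_cast<std::uint16_t>(buf[pos] | (buf[pos + 1] << 8));
        const auto it = buckets.find(key);
        if (it == buckets.end())
            continue; // no signature starts with these two bytes
        for (std::size_t i : it->second)
        {
            const Bytes& sig = signatures[i];
            if (matched[i] || pos + sig.size() > buf.size())
                continue; // already reported, or it would run past the buffer
            if (std::equal(sig.begin(), sig.end(), buf.begin() + pos))
            {
                matched[i] = true;
                found.push_back(i);
            }
        }
    }
    return found;
}
```

Because the scan is not limited to offset 0, a sketch like this would also find an archive signature that follows some leading data, which is presumably why the original code searches the whole buffer.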

Apr 21, 2009 at 6:22 AM

I found an alternative way to determine whether a file is a compressed archive and to find the archive type. You can refer to the following links.

File Signatures Table
URL: http://www.garykessler.net/library/file_sigs.html

How to check if a file is compressed in C#
URL: http://blog.somecreativity.com/2008/04/08/how-to-check-if-a-file-is-compressed-in-c/

Coordinator
Apr 22, 2009 at 10:35 PM
bbhar, thank you very much. Your solution is simple and handy, and I am going to use it. I will store file signatures rather than file extensions.
Coordinator
Apr 23, 2009 at 7:42 PM
I need signatures for UDF, MUB, HFS and DMG. Can't google them, damn.
Coordinator
Apr 23, 2009 at 8:37 PM
Apart from those formats, I have implemented signatures. If signature recognition fails, the usual file-extension lookup is used instead (this applies to tar archives created by 7-Zip, where no signature is written).
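The signature-first, extension-second lookup described here can be sketched roughly as follows in C++. This is only an illustration under stated assumptions, not SevenZipSharp's actual code: FormatSignature, kSignatures, kExtensions, and DetectFormat are hypothetical names, only leading bytes at offset 0 are checked, and the table holds just a few widely documented signatures (7z, zip, gzip, bzip2).

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types and names for illustration; not SevenZipSharp's API.
struct FormatSignature
{
    std::string format;               // label returned to the caller
    std::vector<std::uint8_t> magic;  // bytes expected at offset 0
};

// A few widely documented leading signatures.
static const std::vector<FormatSignature> kSignatures = {
    {"7z",    {0x37, 0x7A, 0xBC, 0xAF, 0x27, 0x1C}},
    {"zip",   {0x50, 0x4B, 0x03, 0x04}},
    {"gzip",  {0x1F, 0x8B}},
    {"bzip2", {0x42, 0x5A, 0x68}},
};

// Extension table, consulted only when no signature matches.
static const std::vector<std::pair<std::string, std::string>> kExtensions = {
    {"7z", "7z"}, {"zip", "zip"}, {"gz", "gzip"}, {"bz2", "bzip2"}, {"tar", "tar"},
};

// Returns the detected format label, or an empty string if nothing matches.
std::string DetectFormat(const std::vector<std::uint8_t>& header,
                         const std::string& fileName)
{
    // 1. Try the leading signature first.
    for (const FormatSignature& s : kSignatures)
    {
        if (header.size() >= s.magic.size() &&
            std::equal(s.magic.begin(), s.magic.end(), header.begin()))
            return s.format;
    }

    // 2. Fall back to the (lower-cased) file extension.
    const std::size_t dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return std::string();
    std::string ext = fileName.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    for (const auto& e : kExtensions)
    {
        if (e.first == ext)
            return e.second;
    }
    return std::string();
}
```

With this ordering, a file such as drawing.svgz whose first bytes are 1F 8B would be reported as gzip even though its extension is not in the table, while a file with no recognized signature falls through to the extension lookup.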