  1. #1
    Junior Member
    Join Date
    Aug 2010
    Posts
    11

    System overflow exception

    I have been working on bulk loading (appending) a large number (~3 million) of objects, each with approximately 30 double properties. To improve performance I am not committing the table until the end of the run. I notice that STSdb is consistently throwing a System.OverflowException, which is caught in the FloatDeltaIndexerPersist.Store(BinaryWriter writer, IIndexer<double> values) method.

    The catch sets the flag NativeWrite = true, and the system goes on its merry way.

    Is this normal behavior?
    Does it have a performance impact?
    If so, is there something I should be doing to eliminate/minimize the behavior?

    many thanks in advance

  2. #2
    Software Development Manager
    Join Date
    Feb 2010
    Posts
    262


    By default, STSdb stores all its data in compression mode. It uses vertical compression for all public read/write properties of the stored type. For properties of numeric types (int, long, float, double, decimal, etc.) it uses vertical delta compression.

    In short, if we have n values v1, v2, ..., vn, the delta compression stores only the first value v1 and the deltas between adjacent values; for example, the sequence 100, 101, 103 is stored as 100, +1, +2. It is a simple, lightweight compression, but floating-point values bring some specific complications.
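
    A minimal sketch of the idea in C# (illustrative only, not STSdb's actual code), assuming the values have already been reduced to integral form:

        // Illustrative delta encode/decode (not STSdb's actual code).
        public static long[] DeltaEncode(long[] values)
        {
            long[] deltas = new long[values.Length];
            if (values.Length == 0)
                return deltas;

            deltas[0] = values[0];                      // first value as-is
            for (int i = 1; i < values.Length; i++)
                deltas[i] = values[i] - values[i - 1];  // then only the differences

            return deltas;
        }

        public static long[] DeltaDecode(long[] deltas)
        {
            long[] values = new long[deltas.Length];
            if (deltas.Length == 0)
                return values;

            values[0] = deltas[0];
            for (int i = 1; i < deltas.Length; i++)
                values[i] = values[i - 1] + deltas[i];  // accumulate the deltas back

            return values;
        }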

    To store floating-point values effectively, we store them as integral values, so we need to know the number of digits after the decimal point of each value. The question is: how to get it, and get it fast? The only way we found in .NET is to cast the Double or Float value to a Decimal one and, based on the internal representation of the Decimal type, get the number of digits:

        const int SIGN_MASK = ~Int32.MinValue;

        public static int GetDigits(Decimal value)
        {
            // The scale (number of digits after the decimal point) lives in
            // bits 16-23 of the fourth element returned by Decimal.GetBits;
            // mask off the sign bit and shift the scale down.
            return (Decimal.GetBits(value)[3] & SIGN_MASK) >> 16;
        }
    Tricky but fast. Then for Float and Double types we have:

        public static int GetDigits(double value)
        {
            decimal val = (decimal)value;
            double tmp = (double)val;
            if (tmp != value)   // the value does not round-trip through decimal,
                return -1;      // so it cannot be compressed this way

            return GetDigits(val);
        }

        public static int GetDigits(float value)
        {
            decimal val = (decimal)value;
            float tmp = (float)val;
            if (tmp != value)   // same round-trip check as for double
                return -1;

            return GetDigits(val);
        }
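
    For example, a quick illustrative check of the helpers above (the expected outputs assume .NET's standard Double-to-Decimal conversion, which keeps at most 15 significant digits):

        Console.WriteLine(GetDigits(1.25));    // 2  - two digits after the decimal point
        Console.WriteLine(GetDigits(100.0));   // 0  - no fractional digits
        Console.WriteLine(GetDigits(Math.PI)); // -1 - needs more than 15 significant
                                               //      digits, the round-trip is not exact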
    So, if we can cast a Double/Float value to Decimal, we can get the digits fast and compress the values; if we cannot, we have to write the values natively, without compression.

    Back to your questions:

    Is this normal behavior?
    Yes, it is (it's not a bug).

    Does it have a performance impact?
    The short answer is: in general, no.

    Usually Double and Float values are suitable for casting to Decimal, and in this case there is zero performance penalty. In the other case, when we meet a value that we cannot cast, we store all values of the current block in raw mode. Of course, there is logic to predict this case. For example, if we add random Double values to the database (generated with Random.NextDouble()), the engine detects that the number of digits of these values is greater than 15, so it knows the cast cannot succeed before the cast actually fails (with a System.OverflowException). Even in this case, the total number of failed compression attempts is minimized.
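
    As a rough sketch of this fallback (illustrative only; the method name and shape here are ours, not the engine's):

        // If any value of the block cannot round-trip through decimal, the
        // whole block is written natively, without delta compression.
        public static void StoreBlock(BinaryWriter writer, double[] block)
        {
            bool nativeWrite = false;
            foreach (double value in block)
            {
                if (GetDigits(value) < 0)  // decimal round-trip failed
                {
                    nativeWrite = true;
                    break;
                }
            }

            writer.Write(nativeWrite);

            if (nativeWrite)
            {
                foreach (double value in block)
                    writer.Write(value);   // raw 8-byte IEEE 754 values
            }
            else
            {
                // ...delta-compress the block as described above...
            }
        }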

    (Maybe in future releases we will try to find an even better way to get the number of digits for the Double and Float types, based on their internal IEEE 754 representation, avoiding the cast to Decimal.)

    If so, is there something I should be doing to eliminate/minimize the behavior?
    As we said above, in practice there is no performance impact. But if you really want the fastest way to store your data, you can write your own persist class and store the records without any compression. The implementation is simple.

  3. #3
    Software Development Manager
    Join Date
    Feb 2010
    Posts
    262

    Writing a custom persist layer

    The example below shows how to make a custom persist layer for a record of type Tick:

        public class Tick
        {
            public DateTime Timestamp { get; set; }
            public double Bid { get; set; }
            public double Ask { get; set; }
    
            public Tick()
            {
            }
        }
    To write a custom persist layer, we have to make a serializable class that implements the IBinaryPersist<IIndexer<Tick>> interface:

        [Serializable]
        public class TickBinaryPersist : IBinaryPersist<IIndexer<Tick>>
        {
            #region IBinaryPersist<IIndexer<Tick>> Members
    
            public void Store(BinaryWriter writer, IIndexer<Tick> data)
            {
            }
    
            public void Load(BinaryReader reader, ref IIndexer<Tick> data)
            {
            }
    
            #endregion
        }
    and then write a simple implementation:

        [Serializable]
        public class TickBinaryPersist : IBinaryPersist<IIndexer<Tick>>
        {
            #region IBinaryPersist<IIndexer<Tick>> Members
    
            public void Store(BinaryWriter writer, IIndexer<Tick> data)
            {
                for (int i = 0; i < data.Count; i++)
                {
                    Tick record = data[i];
    
                    writer.Write(record.Timestamp.Ticks);
                    writer.Write(record.Bid);
                    writer.Write(record.Ask);
                }
            }
    
            public void Load(BinaryReader reader, ref IIndexer<Tick> data)
            {
                for (int i = 0; i < data.Count; i++)
                {
                    Tick record = new Tick();
    
                    record.Timestamp = new DateTime(reader.ReadInt64());
                    record.Bid = reader.ReadDouble();
                    record.Ask = reader.ReadDouble();
    
                    data[i] = record;
                }
            }
    
            #endregion
        }
    That's it. The engine does the rest of the work: it groups the records into blocks of 1024 (XTable.BlockCapacity) and passes them as the IIndexer<Tick> data parameter. All we have to do is write the records to the BinaryWriter.

    The reading process is analogous.

    After writing the implementation, we must tell the relevant table to use the new persist class by setting its RecordPersist property:

        using (StorageEngine stsdb = StorageEngine.FromFile("test.stsdb"))
        {
            var table = stsdb.Scheme.CreateOrOpenXTable<ulong, Tick>(new Locator("table"));
            table.RecordPersist = new TickBinaryPersist();
            stsdb.Scheme.Commit();
    
            ...
        }

  4. #4
    Junior Member
    Join Date
    Aug 2010
    Posts
    11


    Great information, thank you. I might experiment with a custom persist; thanks for the example as well.
