Thursday, October 31, 2013

Wonderful Performance Metrics Tool

As my crawler project nears completion, I worry more and more about performance. How should I measure the performance of my system? There is no easy answer. There are many modules and many parameters, and most importantly, the measurement itself must not degrade system performance or add complexity to the modules.

Even for the JSON parsing function, I spent almost a full working day building the performance measurement, and that was just for unit testing ;( . The result looks like this:


stockprice (36648000 records) elapsed ms:376806.531 for 36000 avg:9.937 variance:5.652 Fastest:9.000 Slowest:244.000
[67, 9 x 16910, 10 x 13029, 11 x 4306, 12 x 868, 73, 13 x 248, 14 x 123, 15 x 49, 17 x 15, 16 x 33, 19 x 9, 18 x 8, 21 x 26, 20 x 11, 23 x 51, 22 x 47, 25 x 43, 24 x 41, 27 x 39, 26 x 38, 29 x 24, 28 x 28, 31 x 3, 30 x 18, 34, 35 x 2, 32 x 2, 33 x 6, 38, 39 x 2, 36 x 6, 37 x 3, 42 x 2, 43, 41, 50, 48, 54, 244]


But what about the other functions? I was almost frightened by the workload ahead, and it seemed my system's launch day would have to be postponed. Then today I found a wonderful library, Metrics, through the Netty example. It is fantastic and saves me a huge amount of time on performance measurement and reporting.

With just a few lines of code, the result below is printed to the system console automatically. If needed, it can also write the results to CSV, a log file, or JMX, and it even provides a servlet that serves the results remotely as JSON. Wonderful!!!

With all these tools, I feel I have almost bought insurance on the quality of my system.


// Print all registered metrics to the console once a minute.
final ConsoleReporter reporter = ConsoleReporter.forRegistry(Metrics.defaultRegistry())
        .convertRatesTo(TimeUnit.SECONDS)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .build();
reporter.start(1, TimeUnit.MINUTES);

// Timer backed by a sliding window that keeps the last nMax measurements.
Timer timer = Metrics.newTimer(this.getClass(), "StockPrice Batch Parse", "timer", new SlidingWindowReservoir(nMax));

// Record one measurement: the time elapsed on stopwatch2.
timer.update(stopwatch2.stop().elapsed(TimeUnit.NANOSECONDS), TimeUnit.NANOSECONDS);



-- Timers ----------------------------------------------------------------------
test.JSONParserTest.StockPrice Batch Parse.timer
             count = 34356
         mean rate = 95.49 calls/second
     1-minute rate = 95.70 calls/second
     5-minute rate = 88.47 calls/second
    15-minute rate = 80.06 calls/second
               min = 9.32 milliseconds
               max = 244.26 milliseconds
              mean = 10.46 milliseconds
            stddev = 2.38 milliseconds
            median = 10.07 milliseconds
              75% <= 10.64 milliseconds
              95% <= 11.99 milliseconds
              98% <= 13.57 milliseconds
              99% <= 22.14 milliseconds
            99.9% <= 31.03 milliseconds 
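
To make the numbers above less of a black box, here is a minimal sketch in plain Java of what a sliding-window timer does: keep the last N durations and compute the mean and percentiles over them. It is not the Metrics implementation. The class name SlidingWindowStats and the nearest-rank percentile rule are assumptions made for illustration, and the library may calculate its quantiles differently.

```java
import java.util.Arrays;

/** Keeps the most recent durations and reports summary statistics over them. */
final class SlidingWindowStats {
    private final long[] window; // ring buffer holding the most recent samples
    private long count;          // total number of samples ever recorded

    SlidingWindowStats(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        window = new long[size];
    }

    /** Records one duration, overwriting the oldest sample once the window is full. */
    void update(long nanos) {
        window[(int) (count % window.length)] = nanos;
        count++;
    }

    /** Sorted copy of the samples currently held in the window. */
    private long[] snapshot() {
        int n = (int) Math.min(count, window.length);
        long[] copy = Arrays.copyOf(window, n);
        Arrays.sort(copy);
        return copy;
    }

    double mean() {
        long[] s = snapshot();
        if (s.length == 0) {
            return 0.0;
        }
        double sum = 0;
        for (long v : s) {
            sum += v;
        }
        return sum / s.length;
    }

    /** Nearest-rank percentile for q in (0, 1]; returns 0 when no samples exist. */
    long percentile(double q) {
        long[] s = snapshot();
        if (s.length == 0) {
            return 0;
        }
        int rank = (int) Math.ceil(q * s.length);
        return s[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        SlidingWindowStats stats = new SlidingWindowStats(1000);
        for (int i = 0; i < 5; i++) {
            long start = System.nanoTime();
            Math.sqrt(i); // stand-in for the code being measured
            stats.update(System.nanoTime() - start);
        }
        System.out.printf("mean=%.0f ns, median=%d ns, 99%%=%d ns%n",
                stats.mean(), stats.percentile(0.5), stats.percentile(0.99));
    }
}
```

The window size plays the same role as nMax in the snippet above: old samples fall out, so the statistics describe recent behaviour rather than the whole run.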





Saturday, October 26, 2013

SCTP vs UDT

I needed a better messaging protocol for my project DTCrawler and found these two newer protocols. After studying them, especially with the help of the book Networks for Grid Applications, I chose UDT because:

1. It is built on UDP (which is my preference).
2. It provides flow and congestion control, which the application needs.

The difference between the two is described like this:

"UDT borrows the messaging and partial reliability semantics from SCTP. However, SCTP are specially designed for VoIP and telephony, but UDT targets general purpose data transfer. UDT unifies both messaging and streaming semantics in one protocol."

Saturday, October 19, 2013

Java Performance Tuning Study Memo

The blog at http://java-performance.info/ is wonderful!!! List of articles: http://java-performance.com/

Java type memory usage


Type                Size
byte, boolean       1 byte
short, char         2 bytes
int, float          4 bytes
long, double        8 bytes
Byte, Boolean       16 bytes
Short, Character    16 bytes
Integer, Float      16 bytes
Long, Double        24 bytes

Collection          Size per element
EnumSet, BitSet     1 bit per value
EnumMap             4 bytes (for value, nothing for key)
ArrayList           4 bytes (but may be more if ArrayList capacity is seriously more than its size)
LinkedList          24 bytes (fixed)
ArrayDeque          4 to 8 bytes, 6 bytes on average

JDK collection      Size                              Possible Trove substitution   Size
HashMap             32 * SIZE + 4 * CAPACITY bytes    THashMap                      8 * CAPACITY bytes
HashSet             32 * SIZE + 4 * CAPACITY bytes    THashSet                      4 * CAPACITY bytes
LinkedHashMap       40 * SIZE + 4 * CAPACITY bytes    None
LinkedHashSet       32 * SIZE + 4 * CAPACITY bytes    TLinkedHashSet                8 * CAPACITY bytes
TreeMap, TreeSet    40 * SIZE bytes                   None
PriorityQueue       4 * CAPACITY bytes                None

All Java objects start with 8 bytes of service information, such as the object's class and its identity hash code (returned by the System.identityHashCode method). Arrays have 4 more bytes (one int field) holding the array length. It looks like all user-written classes (not JDK classes) carry another reference to the object's Class. These fields are followed by all declared fields. All objects are aligned on an 8-byte boundary. All primitive fields must be aligned by their size (for example, chars are aligned on a 2-byte boundary). An object reference (including a reference to any array) occupies 4 bytes.

What does this mean for us? To get the most out of available memory, the fields of an object should occupy N*8+4 bytes in total (4, 12, 20, 28 and so on). In that case no bytes are wasted on alignment padding.
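
The N*8+4 rule can be checked with a little arithmetic. The sketch below assumes a 12-byte header (the 8 bytes of service information plus a 4-byte class reference) and 8-byte alignment, and it ignores padding between individual fields. Those are simplifying assumptions, not measurements, and the class ObjectSizeEstimate is made up for this illustration. With these assumptions the estimates agree with the wrapper sizes in the table above.

```java
/** Rough object size estimate: header plus fields, rounded up to the alignment. */
final class ObjectSizeEstimate {
    // Assumed: 8 bytes of service information + 4-byte class reference.
    static final int HEADER_BYTES = 12;
    // Assumed: every object is aligned on an 8-byte boundary.
    static final int ALIGNMENT = 8;

    /** Estimated size in bytes of an object whose declared fields total fieldBytes. */
    static int estimate(int fieldBytes) {
        int raw = HEADER_BYTES + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Bytes lost to alignment padding for the given total field size. */
    static int padding(int fieldBytes) {
        return estimate(fieldBytes) - HEADER_BYTES - fieldBytes;
    }

    public static void main(String[] args) {
        System.out.println("Integer (one int field):  " + estimate(4) + " bytes");
        System.out.println("Long (one long field):    " + estimate(8) + " bytes");
        System.out.println("Padding with 12 bytes of fields: " + padding(12));
        System.out.println("Padding with 8 bytes of fields:  " + padding(8));
    }
}
```

Field totals of 4, 12, 20 and so on give zero padding here, while 8 or 16 bytes of fields lose 4 bytes each.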

Java Boxing Type Caching

Type                 Cached range
Byte, Short, Long    From -128 to 127
Character            From 0 to 127
Integer              From -128 to java.lang.Integer.IntegerCache.high or 127, whichever is bigger
Float, Double        No caching
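
A quick way to see the cache at work is to compare boxed values by reference. This is a small sketch with a made-up class name, BoxingCacheDemo. Values inside the cached range come back as the same instance. Outside the range the result depends on the JVM settings mentioned in the table, so the code prints that result without assuming it.

```java
/** Shows the effect of the boxing cache on reference comparison. */
final class BoxingCacheDemo {
    /** True when two boxings of the same int yield the very same object. */
    static boolean sameInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // reference comparison on purpose
    }

    public static void main(String[] args) {
        // Inside the guaranteed cache range: same object.
        System.out.println("127 same instance: " + sameInstance(127));
        // Outside the default range: depends on the configured cache upper bound.
        System.out.println("128 same instance: " + sameInstance(128));
        // equals() compares values and is the safe choice for boxed types.
        System.out.println("128 equals: " + Integer.valueOf(128).equals(Integer.valueOf(128)));
    }
}
```

The practical lesson is to compare boxed numbers with equals(), never with ==.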

Java Performance Tips

Never use exceptions as a replacement for return codes or for any event that is likely to happen (especially in methods that are not I/O-bound!). Throwing an exception is too expensive: you may see a 100-fold slowdown for simple methods.

Throwing an exception in Java is a very slow operation. Expect that throwing an exception costs you something between 100 and 1000 ticks in most cases.
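
As an illustration of the first tip, here is a sketch of a parser that reports bad input through its return value instead of throwing. The class SafeParse and its restriction to non-negative numbers of at most 9 digits are assumptions chosen to keep the example short. OptionalInt requires Java 8 or later.

```java
import java.util.OptionalInt;

/** Parses integers without using exceptions for the "bad input" case. */
final class SafeParse {
    /**
     * Parses a non-negative decimal integer of at most 9 digits.
     * Returns an empty OptionalInt for null, empty, too long, or non-numeric input.
     */
    static OptionalInt tryParse(String s) {
        // At most 9 digits, so the accumulated value can never overflow an int.
        if (s == null || s.isEmpty() || s.length() > 9) {
            return OptionalInt.empty();
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return OptionalInt.empty();
            }
            value = value * 10 + (c - '0');
        }
        return OptionalInt.of(value);
    }

    public static void main(String[] args) {
        System.out.println(tryParse("12345")); // a present value
        System.out.println(tryParse("12x45")); // empty, and nothing was thrown
    }
}
```

When malformed input is common, as in crawled data, the failure path here costs no more than the success path.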

Case Study

Tuesday, October 1, 2013

MYSQL Tips

1. Enable the general query log

SET GLOBAL log_output = 'TABLE';
SET GLOBAL general_log = 'ON';
SELECT * FROM mysql.general_log ORDER BY event_time DESC LIMIT 100;

Sunday, September 22, 2013

Memory Hierarchy

Taken from http://dank.qemfd.net/dankwiki/images/d/dc/Memoryhierarchy.png


Thursday, September 12, 2013

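MySQL INSERT ... ON DUPLICATE KEY UPDATE IS FASTER THAN UPDATE!!!

It is very strange, but the test result shows it: 1,000,000 INSERT ... ON DUPLICATE KEY UPDATE statements took 27 seconds, while the same number of plain UPDATE statements took 34 seconds.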

MySQL

innodb_version            5.6.13
protocol_version          10
version                   5.6.13-enterprise-commercial-advanced
version_compile_machine   x86_64
version_compile_os        osx10.7

Result

SELECT udf_CreateCounterID(0,CURRENT_DATE);

SELECT @update,@updateend,@updatediff,@insertupdate,@insertupdate_end,@insertupdatediff,@keyval,@countlmt;

@update=2013-09-12 17:32:27
@updateend=2013-09-12 17:33:01
@updatediff=34

@insertupdate=2013-09-12 17:32:00
@insertupdate_end=2013-09-12 17:32:27
@insertupdatediff=27

@keyval=13
@countlmt=1000000

Table

CREATE TABLE `sys_CounterID` (
  `exch_year` int(11) NOT NULL,
  `nextID` int(11) NOT NULL,
  PRIMARY KEY (`exch_year`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


Test Function

CREATE DEFINER=`root`@`localhost` FUNCTION `udf_CreateCounterID`(exchID SMALLINT, listyear DATE) RETURNS int(10) unsigned
BEGIN
 /**
 The counter ID is 32 bits:
 highest 9 bits: exchange ID (317 operator MICs in total as of 2013; any ID > 511 is taken modulo 512)
 middle 7 bits: 2-digit year (max: 99)
 remaining bits: counter number
 */
 DECLARE keyvalue INT UNSIGNED DEFAULT 0;
 
 SET @countlmt = 1000000;
 SET keyvalue = ((exchID % 512) << 9 ) + EXTRACT(YEAR FROM listyear) % 100;

 SET @keyval = keyvalue;
 SET @retVal =  0;

 SET @count = @countlmt;
 SET @insertupdate = SYSDATE();

 WHILE @count > 0 DO

  INSERT INTO `sys_CounterID`(`exch_year`,nextID)
  VALUE( keyvalue, 1)
  ON DUPLICATE KEY UPDATE 
   nextID = (@retVal := nextID + 1);

  SET @count = @count - 1;

 END WHILE;

 SET @insertupdate_end = SYSDATE();
 SET @insertupdatediff = TIMESTAMPDIFF(SECOND, @insertupdate,@insertupdate_end);

 
 SET @count = @countlmt;
 SET @update = SYSDATE();
 
 WHILE @count > 0 DO

  UPDATE sys_CounterID 
  SET nextID = (@retVal := nextID + 1)
  WHERE exch_year = keyvalue;

  SET @count = @count - 1;

 END WHILE;

 SET @updateend = SYSDATE();
 SET @updatediff = TIMESTAMPDIFF(SECOND, @update,@updateend);


 RETURN @retVal;

END
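
The 32-bit layout described in the function's comment (9 bits of exchange ID, 7 bits of year, the remaining 16 bits for the counter) can be sketched in Java as follows. The test function above only builds the table key, so how the three fields are finally combined is an assumption here, and the class CounterId and its method names are hypothetical.

```java
/** Packs and unpacks a 32-bit counter ID: 9 bits exchange, 7 bits year, 16 bits counter. */
final class CounterId {
    static final int YEAR_BITS = 7;
    static final int COUNTER_BITS = 16; // 32 - 9 - 7, as implied by the comment

    static int pack(int exchId, int year, int counter) {
        int exch = Math.floorMod(exchId, 512); // IDs above 511 wrap around
        int yy = Math.floorMod(year, 100);     // 2-digit year, at most 99
        if (counter < 0 || counter > 0xFFFF) {
            throw new IllegalArgumentException("counter does not fit in 16 bits");
        }
        return (exch << (YEAR_BITS + COUNTER_BITS)) | (yy << COUNTER_BITS) | counter;
    }

    static int exchange(int id) {
        return id >>> (YEAR_BITS + COUNTER_BITS); // unsigned shift: the top bit may be set
    }

    static int year(int id) {
        return (id >>> COUNTER_BITS) & 0x7F;
    }

    static int counter(int id) {
        return id & 0xFFFF;
    }

    public static void main(String[] args) {
        int id = pack(317, 2013, 42);
        System.out.println(exchange(id) + " / " + year(id) + " / " + counter(id));
    }
}
```

Exchange IDs of 256 and above set the sign bit of a Java int, which is why the unpacking uses the unsigned shift >>>.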


Monday, August 26, 2013

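Truth Becomes Clear Only Through Debate

Over the past two days I followed the text transcripts of Bo Xilai's trial to the end. I have to admire his skill. I have never liked him much, and after he launched the "sing red" campaign I especially disliked his showmanship. This trial, however, caught my attention. After reading his testimony and that of the witnesses, I felt an inexplicable sadness. What state of mind do these people live in every day? In the last minute before falling asleep after an evening of clinking glasses, do they feel a sense of loneliness? The office is full of scheming, and home is full of traps too. Is a life like that worth living?

I believe he is a very capable man. Is his downfall, along with his highly controversial "sing red, strike black" campaigns, really only his personal fault? As I see it, his fate was sealed the moment he entered a system that answers only to superiors and runs on absolute power. If showmanship could change the minds of those at the top from the bottom up, it would not be today's China.

In an officialdom where corruption is this rampant, if it really was only 5 million in public funds and one villa, then for someone of his rank and power he was actually quite clean!!!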