Wednesday, January 23, 2008

Sorting Files by Date in Java

When I need to sort files by date initially I under estimated the problem. A simple Java implementation using a TreeSet with comparator on file.lastModified() worked well for a files list up to 300k. Beyond that performance dropped significantly, more specifically while getting a list of files from remote shares.

From the stack dump it is clear that system is spending more time in getting last modified time from the file system. Below is the analysis about the problem situation. Every operation on tree set is in the order of logN. So adding N elements is in the order of N2logN . So if the number calls to remote file system increasing in this order with the number of files.
A solution for this problem is to use the cached version of the File object which is a sub class of java.io.File with cached properties of lastmodified time, full path and other required properties used by comparator for sorting. This brings down the number of calls for getting file properties will be decreased to N.

3 comments:

Ravikanth B V said...

Cool satya, now started learning from your blogs.

---
Ravikanth

saikrishna said...

Hi Satya, Nice to know you are bloging too ...

MKRao said...

Hi Satya, where are you now? I mean where are you working?