Tuesday, July 31, 2012

Improved XLSX Load Time

Although I already did a little work on xlsx file import performance at the beginning of this year's GSOC, Kohei and Markus have now changed my focus from ods import performance to xlsx formula import performance improvements.  Kohei made a large xlsx test file that contains one sheet with 5 columns and 20,000 rows of formula cells.  This file took a long time for the current master (LibreOffice 3.7) to load.  I did some profiling on the test file and saw one area of code that was unnecessarily being called repeatedly, which was wasting a lot of time during import.  So I made some changes and tested the load times.  The load time change was quite dramatic.

Please keep in mind these are not rigorously, scientifically performed tests.  They are just to give an idea of the improvements we are making.  I did these tests on a machine with a 3.2GHz AMD Athlon 64 X2 Dual Core Processor 6400+ and 8GB of RAM running 64-bit GNU/Linux.  I used LibreOffice 3.5.4 and the latest build of my feature branch which gradually gets merged to master (LibreOffice 3.7).


LibreOffice 3.5.4 took 8 minutes and 30 seconds to load this file.  Before this change, LibreOffice 3.7 took 51 seconds to load this file, but after this change, it only took 6 seconds to load the file!  That's an 88% reduction on LibreOffice 3.7 from before the change to after the change, and a 99% reduction from LibreOffice 3.5.4!
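For anyone who wants to check the arithmetic behind those percentages, here is a small Python sketch.  The function name percent_reduction and the variable names are made up for illustration; only the load times come from the measurements above.

```python
def percent_reduction(before_s: float, after_s: float) -> float:
    """Return how much shorter after_s is than before_s, as a percentage."""
    return (before_s - after_s) / before_s * 100.0


# Load times quoted above, converted to seconds.
lo_354 = 8 * 60 + 30   # LibreOffice 3.5.4: 8 minutes 30 seconds
lo_37_before = 51      # LibreOffice 3.7 before the change
lo_37_after = 6        # LibreOffice 3.7 after the change

print(f"3.7 before vs. after: {percent_reduction(lo_37_before, lo_37_after):.0f}%")  # 88%
print(f"3.5.4 vs. 3.7 after:  {percent_reduction(lo_354, lo_37_after):.0f}%")        # 99%
```

Put another way, 51 seconds down to 6 seconds is 8.5 times faster, and 510 seconds down to 6 seconds is 85 times faster.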

More matrix improvements

Kohei Yoshida has been doing some awesome work on the ScMatrix backend. The hope is that this eventually will result in all-around better performance of matrices. Check out Kohei's post here: http://kohei.us/2012/07/20/mdds-multi_type_vector-explained/

Thursday, July 12, 2012

Improved ODS Load Times

Edited to include links to the test files that were used.

Under the guidance of Kohei Yoshida and Markus Mohrhard, I have been working to shorten the time it takes to open an ODS file.  These shorter load times may be more noticeable depending on the content and size of the file that is being loaded.  Although this work should result in all ODS files having at least a small improvement in load times, let's take a look at a couple of extreme cases.

Please keep in mind these are not rigorously, scientifically performed tests.  They are just to give an idea of the improvements we are making.  I did these tests on a machine with a 3.2GHz AMD Athlon 64 X2 Dual Core Processor 6400+ and 8GB of RAM running 64-bit GNU/Linux.  I used LibreOffice 3.5.4 and the latest build of my feature branch which gradually gets merged to master (LibreOffice 3.7).

Test 1 used an ODS file with a single sheet containing 10,001 rows and 149 columns of numbers and simple formulas.  LibreOffice 3.5.4 takes 20 seconds to load this file, and LibreOffice 3.7 takes 16 seconds.  That's four seconds quicker, which is a 20% reduction in load time.  OK, that's not bad.

Test 1


Test 2 used an ODS file with a single sheet containing 5,230 rows and 189 columns of matrix cells with complex formulas.  LibreOffice 3.5.4 takes 26 seconds to load this file, and LibreOffice 3.7 takes 13 seconds.  That's 13 seconds quicker, which is a 50% reduction in load time!

Test 2


Also, take a look at Kohei's post to see some other awesome ODS load time improvements made last year.  (I attempted to mimic his format for the load time charts and drew inspiration for my whole blog post from his post.)

So we've made progress, but we are all continuing to work to improve LibreOffice for a better user experience.

Monday, July 9, 2012

Performance Increase Results Coming Soon

So far my focus has been on improving the performance of ODS import, that is, reducing the load time of spreadsheets that are in the ODS file format. Since I am just starting this blog in July, I have some catching up to do. I plan to make some before and after builds to show the progress of my work and post the difference in load times on this blog.

A General Idea of My LibreOffice Experiences

Much of my time is spent learning how LibreOffice Calc works. I definitely spend a lot more time doing activities such as reading code, understanding the code's intent, and analyzing the flow of code than I do actually writing code myself. This learning experience can be both enjoyable and frustrating... sometimes both at the same time. Even so, like I have said before on IRC, Kohei and Markus are the ones with the master plan; I'm just doing the grunt work. ;-)

On top of getting more experience with coding (C++) and analysis, I have also had the opportunity to get familiar with some very useful development tools such as Git, GDB, Callgrind, OpenGrok, and Doxygen. Markus has also encouraged me to write unit tests for any feature that I touch that isn't already covered by a test and to improve tests that already exist. I have seen firsthand several times how useful it is to have the unit tests catch features that one might accidentally break after modifying code.

GSOC 2012 - LibreOffice

I have the honor of being one of the Google Summer of Code (GSOC) students selected to work on LibreOffice this year. My project is to improve the performance of LibreOffice's spreadsheet software, Calc. My mentors are Kohei Yoshida and Markus Mohrhard. They have continually and patiently answered my questions, provided me with guidance, and well... mentored me. :-) Also, I have to say that everyone I have encountered in the LibreOffice community has shown friendly helpfulness and openness to anyone who seriously wants to contribute. This is my first experience contributing to open source software, and I am loving it!

Although I have been working on and getting familiar with LibreOffice since early May, I have only now begun to blog about my work, on Markus's recommendation. Until now, I have updated my mentors through IRC conversations, emails, and commits. I have never been the blogging type, but I can see the value in it for GSOC and work in general on LibreOffice.