Ipphones

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Thursday, 10 April 2008

The Root of All Evil: Introduction to Optimizing Objective C

Posted on 14:27 by Unknown
Donald Knuth has a very famous quote: "[P]remature optimization is the root of all evil"1. This is a valuable idea to keep in your head while programing. It's actually okay to write inefficient code at times. In the context of Objective-C, this means it's okay to use provided methods and objects in your first pass at writing algorithms, even if you know that those objects or methods might not be the optimal way to do the task in terms of performance. Get the code working, and get it working right, and then go back and optimize intelligently where it make sense. In many places small inefficiencies will have no noticeable affect on your application, and by leveraging thoroughly-tested code from Cocoa which was written by knowledgeable Apple and NeXT engineers, you likely saved yourself considerable time and maintenance headaches.

But, there are times when inefficient code has an impact and needs to be addressed. In Java, Visual Basic, Ruby, and the other high-level languages, there is a lot you can do to optimize your code, but there is a limit to what you can do because those languages insulate you from the inner workings using mechanisms such as garbage collection, and a lot of the code that is running when you run your code is completely outside of your control. In all of those languages, you can help the runtime know when it's okay to free up memory that you're done with it by following good coding practices, but memory management is mostly out of your hands. Other optimizations like loop unrolling can be used in these languages, but the performance yield tends to be less than from the same optimizations in a low-level language compiled directly to machine code.

Here's where Objective-C really has a leg up. It is a high-level language and has a lot in common with Java, VB, and C#. It has garbage collection (well, not on the iPhone, yet) and exception handling and it shields you from many low-level implementation details as a good high-level language should. But, Objective-C is a super-set of C, and even though it does have a small runtime engine to enable object messaging and other features, most of your code gets compiled down to machine language. You are shielded from the low-level implementation details, but not prevented from getting to them. When you have exhausted your options for optimizing using Objective-C and Cocoa, you always have the option to lift up the hood and use good old-fashioned procedural C to optimize even more. A great Objective-C programmer needs to be at least a good C programmer.

Let's take a look at an example of optimizing a small chunk of Objective-C code.

Let's say that we have an NSData that contains encoded binary data stored as 8-bit ASCII characters. In this encoded data,  we need to look for certain sequences of characters to make sure that it contains the type of data it's supposed to. These tags are simple sequences of ASCII characters (aka c-strings). 

Here is my first, quickly written version leveraging built-in Cocoa objects. This was written as part of a category on NSData, so any reference to self is a reference to the NSData object holding the encoded binary data.

I apologize for the code formatting. Blogger is not cooperating and refuses to believe me when I tell it that my code is preformatted using the pre tag.

The Original, Horribly Non-Optimized Version




-(BOOL)containsCString:(char *)cmp
{
NSUInteger offset = 0;
char * byte;
const char * bytes = [self bytes];

while (offset < [self length])
{
byte = (char *)bytes ;
NSString * checkString = [NSString stringWithCString:byte length:strlen(cmp)];
if ([checkString isEqualToString:[NSString stringWithCString:cmp length:strlen(cmp)]])
return YES;

offset++;
}
return NO;
}

Shall I leave it as an exercise for the student to identify just why this code is poor from an optimization standpoint? I'd better not, though the problems are likely obvious to many of you.

First of all, I'm allocating autoreleased objects inside of a loop. And that loop goes byte by byte through a potentially large chunk of data. None of the memory for those objects will be released until the end of the current pass through the run loop, which won't happen until after my loop finishes. I create two autoreleased NSString instances for every byte in the data and then run a string compare between them. NSString is a complex class cluster designed to deal with a multitude of string encodings, and could potentially be using multi-byte characters to represent these strings, meaning that each compare could be comparing two or even three bytes rather than just one. Now, the NSString class cluster is smart and probably won't use multi-byte encoding unless it needs to, but since that's not our code and the algorithms are not documented, we can't know for sure what it's going to do, and it's possible that something in our data could trigger the use of multi-byte encoding

So, basically, I'm being very, very inefficient with both memory and processing power here. That may not be important, depending on how I'm using this in my application, and this version only took me a couple of minutes to write, so if I do this rarely and the chunks of encoded data are small, perhaps this is going to be fine just the way it is. But there's the potential for real problems with this code in an application. 

One way to find out if it's going to be a real problem is to set up unit tests (you do use unit tests, right?) that use representative situations similar to what the code will face in the actual application.

For grins and giggles, let's run MY unit test on this method, which takes two three-megabyte chunks of encoded data - one that contains the tags being searched for and one that doesn't - and do two searches on both chunks. The result? 6.8  seconds to do the two searches on each of the two three-megabyte chunks running it on a a 2.6ghz MacBook Pro. That's definitely a potential Spinning Beachball of Death situation. I guess we'd better look at optimizing, huh?

Note that the amount of time it takes to run this code will depend on a number of factors, and it will not be the same every time, but it only varied by a few milliseconds over a dozen runs, so I'm pretty confident it's the algorithm that's slow and not something else running at the same time2.

First Optimization

The first and most obvious optimization here is getting the comparison string allocation out of the loop. That value does not change at all during the loop, so let's move it up to before the while statement so that only happens once per call to this method. In theory that should significantly reduce memory overhead and probably reduce the processing time significantly as well. Here's the new version:

-(BOOL)containsCString:(char *)cmp
{
NSUInteger offset = 0;
char * byte;
const char * bytes = [self bytes];
NSString *compareString = [NSString stringWithCString:cmp length:strlen(cmp)];
while (offset < [self length])
{
byte = (char *)bytes ;
NSString * checkString = [NSString stringWithCString:byte length:strlen(cmp)];
if ([checkString isEqualToString:compareString])
return YES;

offset++;
}
return NO;
}

Let's run the unit test and see what happens. And the result? This time it runs in a little faster, clocking in at 3.7 seconds. A significant improvement, but not fast enough to avoid that spinning beach ball. Perhaps I'm going to have to spawn threads. I'm going to be doing this encoding a lot in my application. 

So, is there a way I can improve it more? Is there any way for me to avoid autoreleasing so many NSStrings, or even creating them at all? Well, there are a couple of options. I could manually allocate and release the string objects. That would be considerably more efficient than filling up the autorelease pool, in theory. 

The Second Optimization

Here's what we get if we manually allocate and release our NSString instances rather than letting them go into the autorelease pool. This should help with our memory overhead, but it's unclear how much of a speed increase it will give us:


-(BOOL)containsCString:(char *)cmp
{
NSUInteger offset = 0;
char * byte;
const char * bytes = [self bytes];
NSString *compareString = [NSString stringWithCString:cmp length:strlen(cmp)];
while (offset < [self length])
{
byte = (char *)bytes ;
NSString *checkString = [[NSString alloc] initWithCString:byte length:strlen(cmp)];
if ([checkString isEqualToString:compareString])
return YES;
[checkString release];
offset++;
}
return NO;
}
We run it and? 3.44 seconds. Hm.. An improvement, but not a huge one by any means. It does appear that creating strings is one of our bottlenecks. In a more complex method, we would probably have to resort to using Shark or Instruments to identify where our bottlenecks were, but in this case, it's pretty obvious where the problem is.

So, let's look at doing this in straight C. After all, we're just comparing chunks of memory. C can do that, and it can do it fast without the overhead of objects or string encodings or anything else.

The Final Optimization

Instead, let's forget about using NSString or NSMutableString, and just use C. We'll work with pointers, and use some old-school techniques for comparing buffers.



-(BOOL)containsCString:(const char *)cmp
{
NSUInteger offset = 0;
const unsigned char * bytes = [self bytes];

while (offset < [self length])
{
if (memcmp(bytes++, cmp, strlen(cmp))==0)
return YES;

offset++;
}
return NO;
}

This version doesn't create any new Objective-C objects, it simply uses a C function that compares to chunks of memory. Now there is no overhead due to the objects, and none due to string encodings or conversions. 

So, just how much faster is this? Let's run our unit test and find out. How about 0.4 seconds? Not terrible at all for doing a byte-by-byte search through 12 megs of data, especially when half of the data it had to search didn't contain the thing it was looking for.

Concluding Thoughts


We could take this further. There are optimizations we could do to this that would eke out several more milliseconds, and if our code needed to process huge chunks of data, that would probably be worth investigating (and might be a good idea for a future posting), but for my purposes, I've decided that this is good enough performance for my needs. The point of this article was not really to teach you how to optimize your code, and certainly is not showing you the fastest possible implementation of this algorithm, but I did want to drive home the point that Objective-C is C, and that as nice as all the object-oriented goodness is, there are times when it makes sense to roll up your sleeves and do things old school, an option that most high-level languages don't offer, but Objective-C does. I also wanted you to see some of the rather common kinds of mistakes you often see in high-level languages that cause memory bloat and performance bottlenecks so that you'll be less likely to make them in the future.

The timings in this article were all done using the Debug target. The final optimization dropped down to .1 seconds by switching to the Release target, and I was able to drop it even lower by implementing a reverse iterator for the end tag.


1 - Structured Programming with go to Statements, ACM Journal Computing Surveys, Vol 6, No. 4, Dec. 1974. p.268
2 - It does, indeed, help to know that I intentionally wrote this inefficiently.

Email ThisBlogThis!Share to XShare to Facebook
Posted in Cocoa, Cocoa Touch, iPHone, iPhone SDK, Objective-C, Optimizations | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • Making OpenGL ES Screenshot
    The Bit-101 Blog has an entry that shows how to take a screenshot when using OpenGL ES . I tested this in my much-delayed particle-generato...
  • Adding CLANG to Your Build Process
    Frasier Spiers has a nifty piece this morning on using Git pre-commit hooks to automatically run the CLANG Static Analyzer. I'm not a G...
  • CLANG Static Analyzer
    If you aren't using the LLVM/Clang Static Analyzer , you really should be. The Clang Project is an attempt to write a front end for the...
  • A Little Help
    I'm having a problem with OpenGL ES, and it's keeping me from finishing my particle engine post. I was hoping someone here could see...
  • WWDC Accommodations
    Staying downtown in San Francisco is very expensive in the summertime. Bu, if you're going to WWDC, you really want to stay downtown. Yo...
  • Xcode File Templates and a Mystery
    One of the things that confuses many newcomers to Xcode is how to set it up so that your company name gets automatically filled in when you ...
  • Brain Surgery?
    Craig Hockenberry has an interesting post on his blog today about the iPhone background processing issue. Craig speaks from personal experi...
  • Book's Almost Done
    I just finished Chapter 16. I'll give it another read-over in the morning then it will go off to my writing partner for his review, then...
  • iPhone Alley
    Looks like Dave and I are going to make an appearance on the iPhone Alley Podcast next week. We're recording on Sunday night, so I woul...
  • Shuffling Arrays
    Ever want to randomize an array of items? It's a task that, for some reason, I've had to do a lot in recent programs. So, I wrote a ...

Categories

  • 3D Models
  • Ad Hoc Distribution
  • ADC
  • Address Book
  • Amazon
  • Anaglyphs
  • App Store
  • Apple
  • Apple DTS
  • Apple Store
  • Application Store
  • articles
  • Award
  • Background Processing
  • Barcodes
  • Beta
  • Blog
  • Blogger
  • Blogging
  • Blogs
  • Blogspot
  • Book project
  • Bug Reporting
  • Captain Obvious
  • Categories
  • Censorship
  • CFFoundation
  • CGAffineTransform
  • Clang Static Analyzer
  • Cocoa
  • Cocoa Touch
  • Code Reuse
  • Code Signing
  • Computer
  • conferences
  • Controller Classes
  • Core Animation
  • Daring Fireball
  • Database
  • Debugging
  • Defect
  • Delegates
  • Design Awards
  • Developer Certifications
  • Discussion Forums
  • Edit Mode
  • employment opportunities
  • Encryption
  • Enterprise
  • Errata
  • free code
  • Free software
  • Full Screen
  • Game Programming
  • Gestures
  • Getting Started
  • goof
  • Google Code
  • Google Maps
  • Gotcha
  • Help
  • HIG
  • HTTP PUT
  • Idiots
  • Idle Timer
  • Images
  • Instruments
  • Interface Builder
  • iPHone
  • iPhone Applications
  • iPhone Dev Center
  • iPhone Developers
  • iPhone OS 3.0
  • iPhone SDK
  • iPhone SDK PNG
  • iPhone Simulator
  • iPhoneSDK
  • iPod
  • Job Opportunities.
  • k
  • Key Value Observing
  • Keynote
  • KVO
  • Landscape Mode
  • Learn Cocoa
  • Learn Cocoa on the Mac
  • libxml
  • Licensing
  • Mac Developers
  • Mac OS X
  • Macworld Expo
  • Microsoft
  • NDA
  • NeHe
  • New Category
  • New Release
  • NSFileHandle
  • NSMutableArray
  • NSMutableURLRequest
  • NSXML
  • Object-Oriented Design
  • Objective-C
  • Open Source
  • OpenGL ES
  • Optimizations
  • Other blogs
  • Paired Arrays
  • Parsing
  • Particle Engine
  • Party
  • PeopleSoft
  • Performance
  • Persistence
  • Pink Screen of Death
  • Piracy
  • Pixar
  • Podcasts
  • Press Release WTF
  • Press Releases WTF
  • private APIs Google
  • Project Template
  • Properties
  • Random Numbers
  • Rant
  • Rejected
  • Resources
  • Responder Chain
  • REST
  • Reverse Engineering
  • Rumors
  • Runtime
  • Sample Code
  • Screencast
  • screenshot
  • Scroll Views
  • snippet
  • Snow Leopard.
  • SOAP
  • Sockets
  • Source
  • Splash Screen
  • SQLite
  • SQLitePersistentObjects
  • Steve Jobs
  • Steve-Note
  • Strings
  • Stupidity
  • Subversion
  • Table Views
  • Taps
  • Template
  • Tip
  • Tips
  • Tririga
  • tutorials
  • Twitter
  • UIAlertView
  • UIColor
  • UIImage
  • UIPickerView
  • UIScrollView
  • UITextField
  • UIView
  • UIWebView
  • Update
  • Utilities
  • UUID
  • Vacation
  • Version Control
  • Web Services
  • Writing
  • WTF
  • WWDC
  • Xcode
  • XML

Blog Archive

  • ►  2009 (141)
    • ►  May (14)
    • ►  April (30)
    • ►  March (48)
    • ►  February (26)
    • ►  January (23)
  • ▼  2008 (163)
    • ►  December (46)
    • ►  November (25)
    • ►  October (44)
    • ►  September (2)
    • ►  August (5)
    • ►  July (2)
    • ►  June (9)
    • ►  May (2)
    • ▼  April (11)
      • The Dangers of Beta
      • Web Services Fun
      • Simplifying libxml
      • Web Services Fun
      • Golden Ticket
      • WWDC
      • Idiocy Can Come with Many Letters After Its Name
      • The Root of All Evil: Introduction to Optimizing O...
      • SDK Beta 3 is Out
      • You're Doing it Wrong! A Brief Discussion of Categ...
      • Pink Screen of Death
    • ►  March (17)
Powered by Blogger.

About Me

Unknown
View my complete profile