Ipphones

Tuesday, 31 March 2009

Speed with a Catch

Posted on 20:02 by Unknown
A while back, I wrote a post about surface normals in OpenGL ES. Yesterday on Twitter, there was some discussion about using the inverse square root function from Quake 3 to speed up the performance of iPhone OpenGL ES applications. Here is what that method looks like (converted to using GL and iPhone data types):

static inline GLfloat InvSqrt(GLfloat x)
{
    GLfloat xhalf = 0.5f * x;
    int i = *(int*)&x;              // store floating-point bits in integer
    i = 0x5f3759df - (i >> 1);      // initial guess for Newton's method
    x = *(GLfloat*)&i;              // convert new bits into float
    x = x * (1.5f - xhalf * x * x); // one round of Newton's method
    return x;
}

The inverse square root can be used in several ways. Noel Llopis of Snappy Touch pointed out two uses for it on Twitter yesterday: calculating normals and doing spherical UV texture mapping. I'm still trying to wrap my head around the UV texture mapping, but I understand normals pretty well at this point, so I thought I'd see what kind of performance gains I could get using this old optimization. There are all sorts of arguments around the intertubes about whether this function still gives performance gains, but there's an easy way to find out: use it and measure with Shark.

I used my Wavefront OBJ Loader as a test and profiled the loading of the most complex of the three objects, the airplane. The first run used my original code, which stupidly[1] used sqrt(). I then re-ran it using sqrtf(), and then again using the Quake 3 InvSqrt() function above.

The results were impressive, and you definitely do get a performance increase from using this decade-old function on the iPhone. Using InvSqrt() gave a 15% decrease in time spent calculating surface normals over using sqrtf() and a 40% decrease over calculating with sqrt(). That's not an amount to be sneezed at, especially in situations where you need to calculate normals on the fly many times a second.

Now, if you remember, this was how we calculated normals using the square root function from math.h:

static inline GLfloat Vector3DMagnitude(Vector3D vector)
{
    return sqrt((vector.x * vector.x) + (vector.y * vector.y) + (vector.z * vector.z));
}

static inline void Vector3DNormalize(Vector3D *vector)
{
    GLfloat vecMag = Vector3DMagnitude(*vector);
    if (vecMag == 0.0f)
    {
        // A zero-length vector has no direction; pick the x axis and
        // return early so we never divide by zero below.
        vector->x = 1.0f;
        vector->y = 0.0f;
        vector->z = 0.0f;
        return;
    }

    vector->x /= vecMag;
    vector->y /= vecMag;
    vector->z /= vecMag;
}

So... how can we tweak this to use inverse square root? Well, the inverse square root of a number is simply 1 divided by the square root of that number. In Vector3DNormalize(), we divide each of the components of the vector (x, y, and z) by the magnitude of the vector, which is calculated using square root. Since dividing a value by a number is the same as multiplying it by 1 divided by that same number, we can just multiply each component by the inverse magnitude instead, like so:

static inline GLfloat Vector3DFastInverseMagnitude(Vector3D vector)
{
    return InvSqrt((vector.x * vector.x) + (vector.y * vector.y) + (vector.z * vector.z));
}

static inline void Vector3DFastNormalize(Vector3D *vector)
{
    // InvSqrt(0) returns a large finite number, not 0, so the zero
    // vector has to be caught by testing the components themselves.
    if (vector->x == 0.0f && vector->y == 0.0f && vector->z == 0.0f)
    {
        vector->x = 1.0f;
        vector->y = 0.0f;
        vector->z = 0.0f;
        return;
    }

    GLfloat vecInverseMag = Vector3DFastInverseMagnitude(*vector);
    vector->x *= vecInverseMag;
    vector->y *= vecInverseMag;
    vector->z *= vecInverseMag;
}


Sweet, right? If we now use Vector3DFastNormalize() instead of Vector3DNormalize(), each call will be about 15% faster on current generations of the iPhone and iPod Touch compared to using sqrtf().
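
To see what the approximation does to an actual vector, here's a small worked sketch. Vec3, FastInvSqrt, Vec3FastNormalize, and Vec3LengthSquared are stand-in names I'm using so the snippet compiles on its own, without Vector3D or the OpenGL ES headers:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for Vector3D. */
typedef struct {
    float x, y, z;
} Vec3;

/* Stand-in for InvSqrt() using plain float and a memcpy() bit copy. */
static inline float FastInvSqrt(float x)
{
    float xhalf = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5f3759dfu - (i >> 1);
    memcpy(&x, &i, sizeof x);
    return x * (1.5f - xhalf * x * x);
}

/* Scales v to approximately unit length by multiplying each component
   by the approximate inverse magnitude. Assumes v is not the zero vector. */
static void Vec3FastNormalize(Vec3 *v)
{
    float invMag = FastInvSqrt(v->x * v->x + v->y * v->y + v->z * v->z);
    v->x *= invMag;
    v->y *= invMag;
    v->z *= invMag;
}

/* Squared length; exactly 1.0 for a perfectly normalized vector. */
static float Vec3LengthSquared(Vec3 v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}
```

Normalizing (3, 4, 0) this way lands very close to (0.6, 0.8, 0), with a squared length just a hair under 1. For lighting normals that small shortfall is usually invisible, but it's worth knowing about if you feed the result into code that assumes exact unit length.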

But… there's a catch. Actually, two catches.

The Catches


The first catch is that this optimization isn't faster on all hardware. In fact, on some hardware, it is measurably slower than using sqrtf(). That means you're gambling that future hardware will also benefit from this same optimization. Not a huge deal and very possibly a safe bet, but you should be aware of it, and be prepared to back it out quickly should Apple release a new generation of iPhones and iPod Touches that use a different processor.
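
One way to make backing it out quick is to route every call through a single wrapper and pick the implementation with a compile-time flag. This is just a sketch of that idea. USE_FAST_INVSQRT, FastInvSqrt, and InverseSqrt are names I made up for it, not anything from the SDK:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical build flag: define as 0 to fall back to the standard library. */
#ifndef USE_FAST_INVSQRT
#define USE_FAST_INVSQRT 1
#endif

/* Stand-in for InvSqrt() using plain float and a memcpy() bit copy. */
static inline float FastInvSqrt(float x)
{
    float xhalf = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5f3759dfu - (i >> 1);
    memcpy(&x, &i, sizeof x);
    return x * (1.5f - xhalf * x * x);
}

/* The only function the rest of the code should call. */
static inline float InverseSqrt(float x)
{
#if USE_FAST_INVSQRT
    return FastInvSqrt(x);
#else
    return 1.0f / sqrtf(x);
#endif
}
```

With this in place, switching back to sqrtf() is a one-line change (or a compiler flag such as -DUSE_FAST_INVSQRT=0), and you can re-profile both builds whenever new hardware shows up.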

The second, and far more important, catch is the possible legal ramifications of using this code. You see, Id released Quake 3's source code under the GNU General Public License, which is a viral license. If you use source code from a GPL project, you have to open source your entire project under the GPL as well. Now, that's an oversimplification, and there are ways around the GPL, but as a general rule, if you use GPL'd code, you have to make your code GPL also.

But, the waters are a little murky. John Carmack has admitted that he didn't write that function, and doesn't think the other programmers at Id did either. The actual author of the code is unknown. Some of the contributors to the function have been found, but not the original author. That means the code MIGHT be in the public domain. If that's the case, its inclusion in a GPL application doesn't take it out of the public domain.

So, bottom line: is it safe to use? Probably. This function is widely known and widely used and there's been no indication that any possible rights owner has any interest in chasing down every use of this function. Are there any guarantees? Nope.

My recommendation is to use it, but make sure that everywhere you use it, you have a backup method that you can fall back on if you need to. If you want some assurance, you could try contacting Id legal and getting a waiver to use that function. I don't know if they'll respond, or if they'll grant it, but the folks at Id have always struck me as good people, so it might be worth an inquiry if you're risk averse.

1 - sqrt() is a double-precision function. OpenGL ES doesn't support the GLdouble datatype, which means I was doing the calculation on twice as many bits as needed and converting from single to double precision and back again.
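
You can confirm the type difference without running any math at all. In C (with plain math.h, not tgmath.h), sqrt() returns a double even when you hand it a float, while sqrtf() stays in single precision. The two helper names below are made up for this sketch:

```c
#include <math.h>

/* sizeof does not evaluate its operand, so these functions only inspect
   the result types of the two calls; no square root is computed. */

/* Nonzero when sqrt() applied to a float yields a double-sized result. */
static int SqrtReturnsDouble(void)
{
    return sizeof(sqrt(2.0f)) == sizeof(double);
}

/* Nonzero when sqrtf() applied to a float yields a float-sized result. */
static int SqrtfReturnsFloat(void)
{
    return sizeof(sqrtf(2.0f)) == sizeof(float);
}
```

Both helpers should return nonzero when compiled as C. That is why storing the result of sqrt() in a GLfloat costs a promotion on the way in and a truncation on the way out.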
Posted in Game Programming, Open Source, OpenGL ES, Optimizations | No comments