• sampling has very little effect on the performance of the program being profiled, but it will falsely report IO-bound functions as not consuming much time (because a function that is waiting for IO is not running on the CPU, so it collects almost no samples)
• instrumenting correctly reports IO-bound functions as high time consumers; however, instrumenting affects the performance of the program being profiled

You profile code to determine which parts of it are consuming the most time. There are two common approaches: sampling and instrumenting.

Sampling

Most profilers offer this approach to profiling. The profiler samples the program's stack periodically (e.g., 1000 times per second). Each sampled stack tells the profiler which function the CPU is currently executing, and the chain of calls that led to that function.

Once the profiling session is done, all the sampled stack data is analyzed and reports are generated.

One of the reports shows you a table of functions along with the number/percentage of samples that landed in each function. The idea is: the higher the percentage of samples that landed in a function, the more time that function took (this isn’t entirely correct though, and we’ll cover this in just a bit).
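To make the sampling idea concrete, here is a minimal sketch in Python. It assumes CPython (it relies on sys._current_frames()), and every name in it (sample_stacks, profile_by_sampling, busy_work, report) is hypothetical. It is a toy to illustrate the mechanism, not how a production profiler is implemented.

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, stop_event, samples, interval=0.001):
    """Periodically record the call stack of one thread until stop_event is set."""
    while not stop_event.is_set():
        # CPython-specific: maps each thread id to its current top frame.
        frame = sys._current_frames().get(target_thread_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        if stack:
            samples.append(tuple(reversed(stack)))  # outermost caller first
        time.sleep(interval)


def profile_by_sampling(func, interval=0.001):
    """Run func() while a background thread samples the calling thread's stack."""
    samples = []
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), stop_event, samples, interval),
    )
    sampler.start()
    try:
        func()
    finally:
        stop_event.set()
        sampler.join()
    return samples


def busy_work(duration=0.3):
    """Hypothetical CPU-bound workload: spin until duration seconds have passed."""
    deadline = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total


def report(samples):
    """Count how many samples had each function on top of the stack."""
    return collections.Counter(stack[-1] for stack in samples)
```

Calling report(profile_by_sampling(busy_work)).most_common(5) gives the kind of "samples per function" table described above. The exact counts vary from run to run.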

For each function, the report shows a time inclusive and a time exclusive measurement. The time inclusive measurement includes any time (samples really) that was spent in the target function as well as in any functions it called. The time exclusive measurement only tells you the amount of time (again, samples really) that was spent in the target function itself; it does not count any time that was spent in functions it called. In other words, time exclusive takes the number of samples that were spent in the target function, and then subtracts the number of samples that were spent in any functions it called. Time exclusive tells you the amount of work the function itself did (not its sub-functions).
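The inclusive/exclusive distinction can be sketched as a small computation over sampled stacks. The sample data and function names below are made up for illustration; each sample is a tuple of function names with the outermost caller first.

```python
from collections import Counter


def inclusive_exclusive(samples):
    """Return (inclusive, exclusive) sample counts per function.

    Each sample is a tuple of function names, outermost caller first.
    """
    inclusive = Counter()
    exclusive = Counter()
    for stack in samples:
        # Every function on the stack gets one inclusive sample, even if it
        # appears several times because of recursion.
        for name in set(stack):
            inclusive[name] += 1
        # Only the function actually executing gets the exclusive sample.
        exclusive[stack[-1]] += 1
    return inclusive, exclusive


# Hypothetical samples: main calls parse, which sometimes calls read_token.
samples = [
    ("main",),
    ("main", "parse"),
    ("main", "parse"),
    ("main", "parse", "read_token"),
    ("main", "parse", "read_token"),
    ("main", "parse", "read_token"),
]
inclusive, exclusive = inclusive_exclusive(samples)
for name in ("main", "parse", "read_token"):
    print(name, inclusive[name], exclusive[name])
# main 6 1
# parse 5 2
# read_token 3 3
```

Here parse has 5 inclusive samples, 3 of which were really spent in read_token, which leaves 5 - 3 = 2 exclusive samples.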

If a function does very little CPU work and then waits on, say, IO for 10 seconds, it will not collect many of the profiler’s “samples”, because it is not actively being executed by the CPU; it is waiting for an IO operation to complete. So even though this function consumes a lot of time (because it waits for IO for 10 seconds), it collects very few samples and won’t rank high in the profiler’s results. This problem is solved by our second method of profiling (which has drawbacks of its own).
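The gap between "time taken" and "CPU time used" is easy to see by measuring both. In this small sketch, time.sleep stands in for a blocking IO call, and the names wait_for_io and measure are hypothetical.

```python
import time


def wait_for_io(seconds=0.2):
    """Stand-in for a blocking IO call; time.sleep blocks without using the CPU."""
    time.sleep(seconds)


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


wall, cpu = measure(wait_for_io)
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

The wall-clock figure should come out at roughly 0.2 seconds while the CPU figure stays close to zero, which is why a profiler that only sees on-CPU samples barely notices this function.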

Instrumenting

Most profilers also offer this approach to profiling. With instrumentation, the profiler inserts instrumenting statements into the functions so that it can detect each time a function starts and ends. This lets the profiler correctly report functions that spend their time waiting for IO (or for something else) as high time consumers.
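A hand-rolled version of the same idea, as a sketch: a Python decorator that records when each call starts and ends. A real instrumenting profiler inserts its probes automatically. The decorator and the two example functions here are hypothetical, and time.sleep again stands in for an IO wait.

```python
import functools
import time
from collections import defaultdict

# Total wall-clock seconds and call counts, keyed by function name.
total_time = defaultdict(float)
call_count = defaultdict(int)


def instrument(func):
    """Wrap func so every call records its start and end time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            total_time[func.__name__] += time.perf_counter() - start
            call_count[func.__name__] += 1
    return wrapper


@instrument
def fetch_record():
    """Hypothetical IO-bound function; sleep stands in for the IO wait."""
    time.sleep(0.1)


@instrument
def add(a, b):
    """Hypothetical cheap CPU-only function."""
    return a + b


fetch_record()
for i in range(1000):
    add(i, i)

for name in total_time:
    print(f"{name}: {call_count[name]} calls, {total_time[name]:.4f}s")
```

Because the wrapper measures wall-clock time between entry and exit, the single call to fetch_record shows up as the bigger time consumer even though it used almost no CPU. The times recorded this way are inclusive times.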

The drawback of instrumenting is that it changes the performance behavior of what you are profiling! Different programs are affected by different amounts, but every program is affected at least a bit.
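A rough way to see this overhead is to time a tiny function with and without hand-written instrumentation around it. This is only a sketch: the names are hypothetical, and the numbers depend heavily on the machine and the Python version.

```python
import time


def square(x):
    """Tiny function whose own cost is small compared to any bookkeeping."""
    return x * x


def instrumented_square(x):
    """square() plus the bookkeeping an instrumenting profiler would add."""
    start = time.perf_counter()
    result = square(x)
    instrumented_square.elapsed += time.perf_counter() - start
    return result


instrumented_square.elapsed = 0.0


def time_calls(func, n=200_000):
    """Return the wall-clock seconds needed to call func n times."""
    start = time.perf_counter()
    for i in range(n):
        func(i)
    return time.perf_counter() - start


plain = time_calls(square)
instrumented = time_calls(instrumented_square)
print(f"plain: {plain:.3f}s  instrumented: {instrumented:.3f}s")
```

For a function this small, the bookkeeping usually costs more than the work being measured. Functions that do more work per call are affected proportionally less.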

If you want to do instrumentation profiling in Visual Studio, go to Project Properties -> Linker -> Advanced and set the “Profile” option to “Yes” (this corresponds to the /PROFILE linker option).