The Squeak VMProfiler is now available headless in Pharo.
It allows you to profile an expression and get detailed statistics about where the time was spent on the VM side (JIT, interpreter, GC).
It is still in development so feel free to use it and report any bugs. Suggestions and comments are welcome !
Loading the profiler
First, you need to load the project in a Pharo 6 image.
Make sure you are using one of the latest VM, as the profiler uses plugins that were not in the older ones. You can download the latest VM+image using this script :
Make sure you are using one of the latest VM, as the profiler uses plugins that were not in the older ones. You can download the latest VM+image using this script :
wget -O- get.pharo.org/60+vmLatest | bash
Then, you can either load the VMProfiler from the calatog browser (World Menu>Tools>Catalog browser) and install the stable version :
Or use the script below :
Gofer newurl: 'http://www.smalltalkhub.com/mc/Pharo/MetaRepoForPharo60/main';package: 'ConfigurationOfVMProfiler';load.((Smalltalk at: #ConfigurationOfVMProfiler) project version: #stable) load
Using the profiler
Profiling an expression
When the configuration is loaded, a workspace should open :
String streamContents: [ :s | VMProfiler headlessSpyOn: [ 1 to: 10000000 do: [ :i | Object new ] ] reportOn: s ]
It basically shows you how to use the profiler :
- you spy on the expression you want to profile (here, the expression is [ 1 to: 10000000 do: [ :i | Object new ] ])
- the profiling report is then stored as a string (here, s)
- You have to inspect the string to actually read the report
Try and inspect the example from the workspace : you will see the profiling report displayed on screen.
If you want to test it with your own expression, then just type :
String streamContents: [ :s | VMProfiler headlessSpyOn: [ myExpression ] reportOn: s
and inspect it. As the profiler is cadenced at 1.3 GHz, make sure your expression runs for quite a long time to actually get a large amount of samples, otherwise the results may not be significant.
Analyzing the report
The report is composed of different parts :
1- General informations about the profiling
- VM used and its path
- Memory-related data (size of the eden, number of stack pages, code size)
- Duration of the profiling, the sampling frequency
- Total number of samples (a sample = an instruction pointer) gathered during the profiling
2- The profiling report in itself
This part provides the computed profiling statistics, based on the number of samples and their locations.
It is subdivided in 2 parts :
It is subdivided in 2 parts :
- the time spent in the methods generated by the JIT compiler.
- the time spent in the c code of the VM (interpreter and GC)
3- Statistics about the memory
This part deals with the state of the memory after the profiling. It helps detecting potential memory leaks.
4- Statistics about the Garbage collection(s)
This part lists the operations performed by the garbage collector during the profiling. It can help identifying where to tune the GC, especially when a large (and potentially abnormal) amount of time is spent in one of these operations.
This part lists the operations performed by the garbage collector during the profiling. It can help identifying where to tune the GC, especially when a large (and potentially abnormal) amount of time is spent in one of these operations.
- full : the number of full GC performed and their total execution time.
- scavenges : the number of scavenges (eden to survivor area) performed and their cumulated execution time.
- tenures : the number of tenures (promotion to old space) performed.
- root table : the number of root table (old objects pointing at young objects) overflows
5- Statistics about Compiled Code Compactions
This part deals with the number of compiled code compactions and their cumulated execution time. This kind of compaction occurs when the Cog method zone is full. A high (and time consuming) number of compactions may show that you need to increase the Cog Method zone size.
This part deals with the number of compiled code compactions and their cumulated execution time. This kind of compaction occurs when the Cog method zone is full. A high (and time consuming) number of compactions may show that you need to increase the Cog Method zone size.
6- Statistics about Events
Like the previous parts, this part identifies other events that can occur during the profiling and can help you identify if something is wrong.
For instance, the size or the number of stack pages can be modified if a high amount of stack overflows is observed.
Next steps
As mentioned in my GSoC proposal, there are still plenty of things to do !
First, in the next few weeks, the profiling should get more detailed as the in-development JIT and its speculative inlining will require the mapping of the profiling samples to a range of bytecode to actually get a significant profiling.
Then, I will add a UI to the Pharo VMProfiler.
Aucun commentaire:
Enregistrer un commentaire