
Commit 8189b0f

updated gpu opt notebooks with vtune details (#1980)
1 parent b08115a commit 8189b0f

4 files changed

Lines changed: 109 additions & 17 deletions

File tree

DirectProgramming/C++SYCL/Jupyter/gpu-optimization-sycl-training/01_Introduction_to_GPU_Optimization/01_Introduction.ipynb

Lines changed: 43 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -10,6 +10,7 @@
1010
"- [Introduction](#Introduction)\n",
1111
"- [Phases in the Optimization Workflow](#Phases-in-the-Optimization-Workflow)\n",
1212
"- [Profiling and Tuning Your Code](#Profiling-and-Tuning-Your-Code)\n",
13+
" - [Analysis using Intel VTune Profiler](#Analysis-using-Intel-VTune-Profiler) \n",
1314
"- [Locality Matters](#Locality-Matters)\n",
1415
"- [Rightsize Your Work](#Rightsize-Your-Work)\n",
1516
"- [Parallelization](#Parallelization)\n",
@@ -53,10 +54,47 @@
5354
{
5455
"cell_type": "markdown",
5556
"id": "bf9e2b6a-fbe6-42fc-b4bc-aec2f4c65b07",
56-
"metadata": {},
57+
"metadata": {
58+
"jp-MarkdownHeadingCollapsed": true
59+
},
5760
"source": [
5861
"## Profiling and Tuning Your Code\n",
59-
"After you have designed your code for high performance, the next step is to measure how it runs on the target accelerator. Add timers to the code, collect traces, and use tools like VTune Profiler to observe the program as it runs. The information collected can identify where hardware is bottlenecked and idle, illustrate how behavior compares with peak hardware roofline, and identify the most important hotspots to focus optimization efforts."
62+
"After you have designed your code for high performance, the next step is to measure how it runs on the target accelerator. Add timers to the code, collect traces, and use tools like VTune Profiler to observe the program as it runs. The information collected can identify where hardware is bottlenecked and idle, illustrate how behavior compares with peak hardware roofline, and identify the most important hotspots to focus optimization efforts.\n",
63+
"\n",
64+
"#### Analysis using Intel VTune Profiler\n",
65+
"\n",
66+
"You will need this section later to analyze your code performance using Intel VTune Profiler when working with code examples in the different modules.\n",
67+
"\n",
68+
"##### Steps to VTune analysis:\n",
69+
"- Modify code and compile\n",
70+
"- Use the VTune command line to collect profiling data\n",
71+
"- Open the VTune results using the Intel VTune Profiler GUI\n",
72+
" - If the system you are using does not have a GUI, compress and download the VTune results directory, then open the results on a computer that has a GUI and Intel VTune Profiler installed.\n",
73+
"\n",
74+
"##### Detailed Steps to do VTune Analysis:\n",
75+
"\n",
76+
"- Modify the module's example code and then \"Build and Run\"; this generates the binary `lab/a.out`\n",
77+
"- Then in \"Terminal\", go to the current module directory and run the following vtune command (change the `-result-dir` value from `vtune_data` to something that identifies your code) \n",
78+
"```\n",
79+
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/lab/a.out\n",
80+
"```\n",
81+
"- Compress the VTune results directory so it can be copied to your local computer (with a GUI)\n",
82+
"```\n",
83+
"tar -czvf vtune_data.tgz vtune_data\n",
84+
"```\n",
85+
"- Download the compressed vtune results:\n",
86+
" - If using Jupyter, right click on the `*.tgz` file and select \"Download\"\n",
87+
" - If using `ssh`, use `scp` to copy the `*.tgz` to your GUI computer\n",
88+
"- Uncompress the vtune results files:\n",
89+
"```\n",
90+
"tar -xvf vtune_data.tgz\n",
91+
"```\n",
92+
"- On your computer, install \"Intel VTune Profiler\" from [__Intel oneAPI Base Toolkit__](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html)\n",
93+
"- Open __Intel VTune Profiler__, select \"Open Results\" in the \"Welcome\" tab, browse to the downloaded VTune results directory, and select the `*.vtune` file.\n",
94+
"- Navigate to the \"Graphics\" tab and then the \"Platform\" tab to analyze the performance timeline and compute statistics\n",
95+
"- Refer to VTune Profiler documentation for more information\n",
96+
"\n",
97+
"<img src=\"assets/vtune_profiler.png\">\n"
6098
]
6199
},
62100
{
@@ -155,9 +193,9 @@
155193
],
156194
"metadata": {
157195
"kernelspec": {
158-
"display_name": "Python 3 (Intel® oneAPI 2023.2)",
196+
"display_name": "Python 3 (ipykernel)",
159197
"language": "python",
160-
"name": "c009-intel_distribution_of_python_3_oneapi-beta05-python"
198+
"name": "python3"
161199
},
162200
"language_info": {
163201
"codemirror_mode": {
@@ -169,7 +207,7 @@
169207
"name": "python",
170208
"nbconvert_exporter": "python",
171209
"pygments_lexer": "ipython3",
172-
"version": "3.9.16"
210+
"version": "3.11.5"
173211
}
174212
},
175213
"nbformat": 4,
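The collect-and-package steps described in the notebook cell above could be wrapped in a small shell script. This is a minimal sketch, not part of the commit: the helper names `collect_gpu_hotspots` and `package_results` are hypothetical, and it assumes `vtune` is on `PATH` and that `tar` supports gzip compression (`-z`). The `vtune` options are the ones shown in the notebook.

```shell
# Hypothetical helper: run the gpu-hotspots collection from the notebook
# into a named result directory. Assumes vtune is installed and on PATH.
collect_gpu_hotspots() {
    result_dir="$1"
    binary="$2"
    if ! command -v vtune >/dev/null 2>&1; then
        echo "vtune not found on PATH; skipping collection" >&2
        return 1
    fi
    vtune -collect gpu-hotspots -result-dir "$result_dir" "$binary"
}

# Hypothetical helper: pack a result directory into a gzip-compressed
# archive named <result_dir>.tgz so it can be downloaded in one piece.
package_results() {
    result_dir="$1"
    tar -czf "${result_dir}.tgz" "$result_dir"
}

# Example usage, run from the module directory after "Build and Run":
#   collect_gpu_hotspots vtune_data "$(pwd)/lab/a.out" && package_results vtune_data
```

On the GUI computer, `tar -xvf vtune_data.tgz` unpacks the archive before it is opened in Intel VTune Profiler.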

DirectProgramming/C++SYCL/Jupyter/gpu-optimization-sycl-training/03_Memory_Optimization/031_Memory_Optimization_Buffers.ipynb

Lines changed: 33 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -127,10 +127,18 @@
127127
"<img src='assets/vtune_buffer_read_write.png'>\n",
128128
"<img src='assets/vtune_buffer_write_noinit.png'>\n",
129129
"\n",
130-
"Below is vtune command line to capture the `gpu-hotspots` data using a terminal and the resulting captured data can be viewed using Intel VTune Profiler:\n",
130+
"#### Analysis using Intel VTune Profiler\n",
131+
"\n",
132+
"The `vtune` command line below captures `gpu-hotspots` data from a terminal; the captured data can then be viewed in the Intel VTune Profiler GUI:\n",
133+
"\n",
134+
"- Modify the code in the section above, then \"Build and Run\" it\n",
135+
"- Then in \"Terminal\", go to the current module directory and run the following vtune command (change the `-result-dir` value from `vtune_data` to something that identifies your code) \n",
131136
"```\n",
132-
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/a.out\n",
133-
"```\n"
137+
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/lab/a.out\n",
138+
"```\n",
139+
"- Download the VTune results directory and open it in the Intel VTune Profiler GUI for analysis.\n",
140+
"\n",
141+
"Detailed instructions for capturing VTune data and performing analysis are in the \"Introduction to GPU Optimization\" module under the \"Analysis using Intel VTune Profiler\" section.\n"
134142
]
135143
},
136144
{
@@ -750,6 +758,25 @@
750758
"! ./q.sh run_buffer_mem_move_3.sh"
751759
]
752760
},
761+
{
762+
"cell_type": "markdown",
763+
"id": "b3805632-b7d0-4620-b138-84e159575a59",
764+
"metadata": {},
765+
"source": [
766+
"#### Analysis using Intel VTune Profiler\n",
767+
"\n",
768+
"The `vtune` command line below captures `gpu-hotspots` data from a terminal; the captured data can then be viewed in the Intel VTune Profiler GUI:\n",
769+
"\n",
770+
"- Modify the code in the section above, then \"Build and Run\" it\n",
771+
"- Then in \"Terminal\", go to the current module directory and run the following vtune command (change the `-result-dir` value from `vtune_data` to something that identifies your code) \n",
772+
"```\n",
773+
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/lab/a.out\n",
774+
"```\n",
775+
"- Download the VTune results directory and open it in the Intel VTune Profiler GUI for analysis.\n",
776+
"\n",
777+
"Detailed instructions for capturing VTune data and performing analysis are in the \"Introduction to GPU Optimization\" module under the \"Analysis using Intel VTune Profiler\" section."
778+
]
779+
},
753780
{
754781
"cell_type": "markdown",
755782
"id": "2f784dac-56ae-459b-a3c9-675407a6d140",
@@ -768,9 +795,9 @@
768795
],
769796
"metadata": {
770797
"kernelspec": {
771-
"display_name": "Python 3 (Intel® oneAPI 2023.2)",
798+
"display_name": "Python 3 (ipykernel)",
772799
"language": "python",
773-
"name": "c009-intel_distribution_of_python_3_oneapi-beta05-python"
800+
"name": "python3"
774801
},
775802
"language_info": {
776803
"codemirror_mode": {
@@ -782,7 +809,7 @@
782809
"name": "python",
783810
"nbconvert_exporter": "python",
784811
"pygments_lexer": "ipython3",
785-
"version": "3.9.16"
812+
"version": "3.11.5"
786813
}
787814
},
788815
"nbformat": 4,

DirectProgramming/C++SYCL/Jupyter/gpu-optimization-sycl-training/03_Memory_Optimization/032_Memory_Optimization_USM.ipynb

Lines changed: 33 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -140,10 +140,18 @@
140140
"ze_tracer plot showing copy-in overlap with execution of compute kernel\n",
141141
"<img src=\"assets/zetracer_overlap.jpeg\">\n",
142142
"\n",
143-
"Below is vtune command line to capture the `gpu-hotspots` data using a terminal and the resulting captured data can be viewed using Intel VTune Profiler:\n",
143+
"#### Analysis using Intel VTune Profiler\n",
144+
"\n",
145+
"The `vtune` command line below captures `gpu-hotspots` data from a terminal; the captured data can then be viewed in the Intel VTune Profiler GUI:\n",
146+
"\n",
147+
"- Modify the code in the section above, then \"Build and Run\" it\n",
148+
"- Then in \"Terminal\", go to the current module directory and run the following vtune command (change the `-result-dir` value from `vtune_data` to something that identifies your code) \n",
144149
"```\n",
145-
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/a.out\n",
146-
"```\n"
150+
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/lab/a.out\n",
151+
"```\n",
152+
"- Download the VTune results directory and open it in the Intel VTune Profiler GUI for analysis.\n",
153+
"\n",
154+
"Detailed instructions for capturing VTune data and performing analysis are in the \"Introduction to GPU Optimization\" module under the \"Analysis using Intel VTune Profiler\" section.\n"
147155
]
148156
},
149157
{
@@ -243,6 +251,25 @@
243251
"! ./q.sh run_usm_copy_partial.sh"
244252
]
245253
},
254+
{
255+
"cell_type": "markdown",
256+
"id": "0ea5e96a-68b1-42fa-a1ba-ba5a8dfaff24",
257+
"metadata": {},
258+
"source": [
259+
"#### Analysis using Intel VTune Profiler\n",
260+
"\n",
261+
"The `vtune` command line below captures `gpu-hotspots` data from a terminal; the captured data can then be viewed in the Intel VTune Profiler GUI:\n",
262+
"\n",
263+
"- Modify the code in the section above, then \"Build and Run\" it\n",
264+
"- Then in \"Terminal\", go to the current module directory and run the following vtune command (change the `-result-dir` value from `vtune_data` to something that identifies your code) \n",
265+
"```\n",
266+
"vtune -collect gpu-hotspots -result-dir vtune_data $(pwd)/lab/a.out\n",
267+
"```\n",
268+
"- Download the VTune results directory and open it in the Intel VTune Profiler GUI for analysis.\n",
269+
"\n",
270+
"Detailed instructions for capturing VTune data and performing analysis are in the \"Introduction to GPU Optimization\" module under the \"Analysis using Intel VTune Profiler\" section."
271+
]
272+
},
246273
{
247274
"cell_type": "markdown",
248275
"id": "828a2d7b-aa77-48f2-a5f8-64f75c62e4f1",
@@ -340,9 +367,9 @@
340367
],
341368
"metadata": {
342369
"kernelspec": {
343-
"display_name": "Python 3 (Intel® oneAPI 2023.2)",
370+
"display_name": "Python 3 (ipykernel)",
344371
"language": "python",
345-
"name": "c009-intel_distribution_of_python_3_oneapi-beta05-python"
372+
"name": "python3"
346373
},
347374
"language_info": {
348375
"codemirror_mode": {
@@ -354,7 +381,7 @@
354381
"name": "python",
355382
"nbconvert_exporter": "python",
356383
"pygments_lexer": "ipython3",
357-
"version": "3.9.16"
384+
"version": "3.11.5"
358385
}
359386
},
360387
"nbformat": 4,

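Each notebook cell in this commit suggests changing the `-result-dir` value from `vtune_data` to something that identifies your code. One possible way to do that is to derive the name from a short label plus a timestamp. This is only a sketch: the helper name `make_result_dir_name` and the `vtune_<label>_<timestamp>` naming scheme are assumptions, not something the notebooks prescribe.

```shell
# Hypothetical helper: build a result directory name from a label and a
# timestamp, e.g. "vtune_usm_copy_20240101_120000".
make_result_dir_name() {
    label="$1"
    # Replace every character other than letters, digits, '_' and '-' with '_'
    # so the label is safe to use as a directory name.
    safe=$(printf '%s' "$label" | tr -c 'A-Za-z0-9_-' '_')
    printf 'vtune_%s_%s\n' "$safe" "$(date +%Y%m%d_%H%M%S)"
}

# Example usage, run from the module directory after "Build and Run":
#   result_dir=$(make_result_dir_name "usm copy")
#   vtune -collect gpu-hotspots -result-dir "$result_dir" "$(pwd)/lab/a.out"
```

The timestamp keeps results from successive runs of the same example in separate directories, which makes them easier to compare side by side in the GUI.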