
Cyring / CoreFreq


It would be great to see the uncore clocks #31


travisdowns commented Jul 20, 2017

Much of what this tool shows is already available in the turbostat tool included in most distributions (but the UI is much nicer!), but showing the uncore clock(s) would be something awesome and new.

cyring commented Jul 20, 2017

The memory controller can be queried using the driver option Experimental=1.

So far, this has been tested successfully with the i7-920 Nehalem QPI, some Core2, and Turion HyperTransport. Other architectures have been programmed blind, based on datasheets, and are untested.

Besides the UI, those are the reasons why CoreFreq differs from other tools. Its driver aims to provide a framework for querying the processor registers, PCI, and other instructions over a low-latency path for accuracy.

travisdowns commented Jul 20, 2017

I think this is in the "uncore", not the "offcore" (where I think the memory controller stuff lives). That is, the clock that the L3 ring is on?

I’m also excited about the driver. Currently I’m using libpfc, which offers user-space reads of the PMC, but it would be great to have a driver to read the things for which there is no user-space access at all.

cyring commented Jul 21, 2017

Here is what I can provide for the Nehalem IMC.

cyring commented Jul 21, 2017

cyring commented Oct 15, 2017

The Uncore fixed counter has been implemented for the Nehalem architecture.

Work is slowly in progress for SNB-EP & HSW-EP.

Alpha testers are missing for SNB, IVB & similar μArch.

travisdowns commented Oct 16, 2017

@cyring: I am on SKL. Would testing on that be useful for you?

cyring commented Oct 17, 2017

Yes, it will be helpful. However, I have committed code for Nehalem only, b/c my attempts to enable some MSR registers for the Xeon Uncore crashed servers.
So you will have to start the kernel module with the NHM architecture identifier:

insmod corefreqk.ko ArchID=19

Then run dmesg to verify that the architecture is acknowledged by the driver.

Next, in corefreq-cli, using the "Pkg. cycles" view, look at the UNCORE counter.

travisdowns commented Oct 18, 2017

I tried it, but unfortunately, immediately upon loading the kernel module with ArchID=19, I got a hard lockup and had to reboot with the power button, so I wasn’t able to run any further tests.

The module loaded fine without the ArchID=19 though.

cyring commented Oct 18, 2017

So the SKL Uncore counter is not programmed the same way as on NHM.
Can you please send me the output of corefreq-cli -s?

travisdowns commented Oct 20, 2017

Here’s what I got:

cyring commented Oct 21, 2017

Thank you.
I’m programming an algorithm for Skylake architectures (desktop, mobile, Xeon).

cyring commented Dec 11, 2017

Hello,
I have programmed the Uncore fixed counter for Sandy Bridge and later architectures.
It has been tested OK with a Broadwell [06_3D].
For Skylake, it is the same algorithm but with different MSR registers: can you give it a try, please?

travisdowns commented Dec 11, 2017

@cyring: do I still need the explicit ArchID=19 specifier? What would you like me to test?

travisdowns commented Dec 11, 2017

Where can I see the core frequency? I only found it on the "dashboard" tab, but it is unreadable to me there because of the large ASCII-art letters. In any case, there seems to be a display problem:

Note the numbers in between the fields and what appears to be an "E" following the uncore clock.

In my experience the uncore clock varied between 0000 and 1844 when idle and stayed around 0030 under heavy load, which doesn’t seem right.

cyring commented Dec 11, 2017

Thanks for the quick reply.
The good news is that your processor handles Uncore readings without crashing ;-)
With Broadwell, I see the same issue with large numbers. It could be an overflow of the Uncore counter. I’m working on this.
In the UI menu, you can also go to "View" -> "Package cycles", where the Uncore frequency is displayed.


cyring commented Dec 12, 2017

It’s a relative frequency, whereas the Nehalem Uncore frequency is constant.
Thus the counter delta was negative over the period. I have committed a workaround (absolute difference).

You will need to apply some load in parallel b/c the Uncore fixed counter does not count during stalled cycles (such as C-states).
You may also notice a short erratic value when transitioning from load to idle: it’s a side effect of my formula.

cyring commented Dec 20, 2017

Hello,
Please let me know whether the Uncore frequency is showing up with your processor.

travisdowns commented Dec 20, 2017

@cyring is there a fix for the display issue? I can try again. Right, I also recently read that when the socket is in C1 the uncore doesn’t tick.

cyring commented Dec 21, 2017

@travisdowns: I have experimented with a Broadwell mobile processor; the fixed performance counter (FC0) of the Uncore counts cycles only while in C0.
Thus during idle states, from C1 down to the lowest Cx, FC0 does not increment, and the measurement (previous FC0 - current FC0 over a 1-second interval) drops to or near zero.
This is confirmed in the Intel SDM specifications.
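
The measurement described in this comment, a counter delta over a fixed interval, can be sketched as follows. This is only an illustration under stated assumptions, not CoreFreq's actual code: the function names are hypothetical, and the 48-bit counter width is an assumption that should be checked against the documentation for the specific processor. The sketch takes the delta as current minus previous, modulo the counter width, which is one common way to avoid the negative deltas and overflow effects mentioned earlier in the thread; it is not the absolute-difference workaround that was committed.

```python
# Assumption: the uncore fixed counter is 48 bits wide; adjust for real hardware.
COUNTER_BITS = 48


def counter_delta(previous: int, current: int, bits: int = COUNTER_BITS) -> int:
    """Cycles counted between two samples, tolerating one counter wraparound."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so a wrapped counter (current < previous) still yields the true delta.
    return (current - previous) % (1 << bits)


def uncore_mhz(previous: int, current: int, interval_s: float,
               bits: int = COUNTER_BITS) -> float:
    """Average uncore clock in MHz over the sampling interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return counter_delta(previous, current, bits) / interval_s / 1e6


if __name__ == "__main__":
    # Hypothetical samples taken one second apart while the package is busy.
    print(uncore_mhz(1_000, 2_600_001_000, 1.0))
    # A wrapped counter still gives a small positive delta.
    print(counter_delta((1 << COUNTER_BITS) - 10, 5))
    # A counter that did not advance (package idle) reads as zero.
    print(uncore_mhz(42, 42, 1.0))
```

Because the counter only advances in C0, the result is an average over the whole interval: if the package was idle for part of it, the figure will be lower than the clock the uncore actually ran at while active.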

In short, you will have to put the processor in C0 to read its Uncore frequency.
My way is to run "sha1sum /dev/zero" in another terminal.

I have committed UI fixes & changes; can you please show me which display issue you have?

cyring commented Dec 28, 2017

Also tested OK with an IVB i7-3770K.

travisdowns commented Dec 28, 2017

@cyring: I tested this on my Skylake using the "package cycles" view, and indeed the "uncore" shows some number, but it doesn’t seem correct.

With the system under load (intel_pstate governor set to performance) the value fluctuates between 50,000,000 and 150,000,000. There are no units: is that in Hz? The true uncore frequency should be similar to the CPU frequency of around 3 GHz, at least when loaded down like this.

cyring commented Dec 29, 2017

@travisdowns: issue reopened.
Could you post a photo of the BIOS showing the Uncore value?
To my understanding, the intel_pstate governor setting applies a profile but does not apply any load by itself. Have you executed a load using a command such as
"taskset -c 0 sha1sum /dev/zero"?

travisdowns commented Dec 30, 2017

My BIOS doesn’t show uncore clocks, sorry. However, you can find plenty of references indicating that the uncore clocks have the same range as the core clocks: e.g., on a 3.5 GHz CPU the max uncore clock is also 3.5 GHz. Under load that isn’t core-local (i.e., you should use a load that touches enough memory to hit the L3 at least) I’d expect it to be near the maximum almost all the time.

Sorry for the confusion: I was reporting my intel_pstate setting because it is important for various power-saving behaviors that greatly affect things like uncore clock rates (e.g., "memory efficient turbo"), but I applied load separately four different ways:

cyring commented Aug 15, 2018

Hello,
There has been additional code for Skylake.
Do you get correct min and max Uncore frequencies?

travisdowns commented Aug 16, 2018

@cyring: can you be specific about what I should check? ./corefreq-cli -s seems to report unchanging max and min for uncore.

On the frequency screen I see a value above UNCORE x26 which varies from 1.00 or so up to 4000, but spends most of its time between 20 and 60. So I’m not sure what is going on or how to interpret the uncore figure.

How it works

On modern motherboards, the processor's operating frequency is determined by multiplying the external frequency (the frequency of the FSB system bus) by a special number called the frequency multiplier. For example, if the system bus runs at 133 MHz and the multiplier is 10, the processor runs at 1330 MHz. Changing this parameter therefore changes the frequency at which the processor operates.
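
The arithmetic in this paragraph can be written out as a small sketch. The function name is hypothetical and chosen only for illustration; the 133 MHz and x10 figures come from the example in the text.

```python
def cpu_frequency_mhz(bus_mhz: float, multiplier: float) -> float:
    """Processor clock in MHz: external (bus) clock times the multiplier."""
    if bus_mhz <= 0 or multiplier <= 0:
        raise ValueError("bus_mhz and multiplier must be positive")
    return bus_mhz * multiplier


if __name__ == "__main__":
    # The example from the text: a 133 MHz bus with a x10 multiplier.
    print(cpu_frequency_mhz(133, 10))  # 1330
    # Lowering the multiplier lowers the clock on the same bus.
    print(cpu_frequency_mhz(133, 8))
```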

Many modern motherboards support this feature, although changing the multiplier is often locked. Usually the parameter can be both raised and lowered relative to its nominal value, although sometimes, especially with Intel processors, only a partial change is possible: the multiplier can be lowered but not raised above nominal.

The BIOS option described here gives the user a tool for setting the required multiplier. Its value is usually chosen from a series of numbers that depends on the CPU model and the motherboard, for example 2, 2.5, 3, 4, 5.5, 6 and so on. The multiplier may also be shown in the option as a proper fraction, for example 1:2, 1:5, 2:5, etc.

This option is not found in every BIOS, only where the motherboard lets the user set the multiplier. On boards where this is not possible, the option may be purely informational and show a predefined multiplier value. The option may also go by other names, such as CPU Ratio or Multiplier Factor. It is usually located in the BIOS section devoted to the frequency and voltage settings of the motherboard and processor (sometimes in a dedicated section for processor settings only). In many BIOSes where the parameter can be set, you first have to enable frequency editing through another option, for example CPU Host Clock Control.

The CPU Ratio option is often useful for users who try to raise a PC's stock performance by overclocking. The CPU multiplier is usually changed together with the system bus frequency, and sometimes also with the processor core voltage. These operations are carried out through other BIOS functions, such as CPU Clock and CPU Vcore.




Which value should you choose?

If you do not plan to overclock the CPU, it is best to leave the multiplier at the BIOS default. Increasing this parameter raises the processor's operating frequency and, as a consequence, its performance. However, you should also reckon with the possible negative consequences of overclocking: unstable operation of the computer and very strong heating of the processor, which calls for additional cooling measures.