Prepared for templating use
Marius Drechsler 2025-02-17 16:36:45 +01:00
parent 80a1eca011
commit f2bdebc7dc
Signed by: marius
GPG key ID: 56D4131BA3104777
144 changed files with 22 additions and 41387 deletions


@ -1,439 +0,0 @@
#import "@preview/glossarium:0.4.1": *
#import "@preview/tablex:0.0.8": tablex, rowspanx, colspanx
= Boundary Adaptive Clustering with Helper Data (BACH)
//Instead of generating helper-data to improve the quantization process itself, like in #gls("smhdt"), or using some kind of error correcting code after the quantization process, we can also try to find helper-data before performing the quantization that will optimize our input values before quantizing them to minimize the risk of bit and symbol errors during the reconstruction phase.
We can explore the option of finding helper data before performing the quantization process.
This approach aims to optimize our input values prior to quantization, which may help minimize the risk of bit and symbol errors during the reconstruction phase.
This differs from methods like @smhdt, which generate helper data to improve the quantization process itself, or those that apply error-correcting codes afterward.
Since this #gls("hda") modifies the input values before the quantization takes place, we will consider the input values as zero-mean Gaussian distributed and not use a CDF to transform these values into the tilde-domain.
== Optimizing single-bit sign-based quantization<sect:1-bit-opt>
Before we take a look at the higher-order quantization cases, we will start with a very basic method of quantization: a quantizer that only returns a symbol with a width of $1$ bit and uses the sign of the input value to determine the resulting bit symbol.
#figure(
include("./../graphics/quantizers/bach/sign-based-overlay.typ"),
caption: [1-bit quantizer with the PDF of a normal distribution]
)<fig:1-bit_normal>
If we overlay the PDF of a zero-mean Gaussian distributed variable $X$ with a sign-based quantizer function as shown in @fig:1-bit_normal, we can see that the expected value of the Gaussian distribution overlaps with the decision threshold of the sign-based quantizer.
Considering that the margin of error of the value $x$ is comparable with the one shown in @fig:tmhd_example_enroll, we can conclude that values of $X$ that reside near $0$ are to be considered more unreliable than values that are further away from the x-value 0.
This means that the quantizer used here is very unreliable as is.
Now, to increase the reliability of this quantizer, we can try to move our input values further away from the value $x = 0$.
To do so, we can define a new input value $z$ as a linear combination of two realizations of $X$, $x_1$ and $x_2$ with a set of weights $h_1$ and $h_2$ that we will use as helper data:
$
z = h_1 dot x_1 + h_2 dot x_2 ,
$<eq:lin_combs>
with $h_i in {plus.minus 1}$. Building only the sum of two input values $x_1 + x_2$ is not sufficient here, since the resulting distribution would be a normal distribution with $mu = 0$ as well.
=== Derivation of the resulting distribution
To find a description for the random distribution $Z$ of $z$ we can interpret this process mathematically as a maximisation of a sum.
This can be realized by replacing the values of $x_i$ with their absolute values as this always gives us the maximum value of the sum:
$
z = abs(x_1) + abs(x_2)
$
Taking into account that $x_i$ are realizations of a normal distribution, we can assume without loss of generality that $X$ is i.i.d., /*to have its expected value at $x=0$ and a standard deviation of $sigma = 1$ --*/ defining the overall resulting random distribution $Z$ as:
$
Z = abs(X_1) + abs(X_2).
$<eq:z_distribution>
We will redefine $abs(X)$ as a half-normal distribution $Y$ whose PDF is
$
f_Y(y, sigma) &= frac(sqrt(2), sigma sqrt(pi)) lr(exp(-frac(y^2, 2 sigma^2)) mid(|))_(sigma = 1), y >= 0 \
&= sqrt(frac(2, pi)) exp(- frac(y^2, 2)) .
$<eq:half_normal>
Now, $Z$ simplifies to
$
Z = Y_1 + Y_2.
$
We can assume for now that the realizations of $Y$ are independent of each other.
The PDF of the addition of these two distributions can be described through the convolution of their respective PDFs:
$
f_Z(z) &= integral_0^z f_Y (y) f_Y (z-y) \dy\
&= integral_0^z [sqrt(2/pi) exp(-frac(y^2,2)) sqrt(2/pi) exp(-frac((z-y)^2, 2))] \dy\
&= 2/pi integral_0^z exp(- frac(y^2 + (z-y)^2, 2)) \dy #<eq:z_integral>
$
Evaluating the integral of @eq:z_integral, we can now describe the resulting distribution of this maximisation process analytically:
$
f_Z (z) = 2/sqrt(pi) exp(-frac(z^2, 4)) "erf"(z/2), quad z >= 0.
$<eq:z_result>
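The closed form in @eq:z_result can be checked numerically. The following minimal sketch assumes standard normal inputs ($sigma = 1$); the helper names `f_z` and `integrate` are illustrative and not part of the original work. It integrates the density with a trapezoidal rule and compares it with a Monte Carlo estimate.

```python
import math
import random

def f_z(z: float) -> float:
    """PDF of Z = |X_1| + |X_2| for standard normal X_i (zero for z < 0)."""
    if z < 0:
        return 0.0
    return 2.0 / math.sqrt(math.pi) * math.exp(-z * z / 4.0) * math.erf(z / 2.0)

def integrate(f, lo: float, hi: float, steps: int = 20_000) -> float:
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

# The density should integrate to 1 and reproduce E[Z] = 2 * sqrt(2 / pi).
area = integrate(f_z, 0.0, 20.0)
mean = integrate(lambda z: z * f_z(z), 0.0, 20.0)

# Monte Carlo estimate of P(Z <= 1) for comparison with the integrated PDF.
rng = random.Random(0)
samples = [abs(rng.gauss(0, 1)) + abs(rng.gauss(0, 1)) for _ in range(200_000)]
p_mc = sum(s <= 1.0 for s in samples) / len(samples)
p_pdf = integrate(f_z, 0.0, 1.0)
```

Differentiating $"erf"(z/2)^2$ reproduces the density, so `p_pdf` should also agree with `math.erf(0.5) ** 2`.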
Our derivation of $f_Z$ currently only accounts for the addition of positive values of $x_i$, but two negative $x_i$ values would also return the maximal distance to the coordinate origin.
The derivation for the corresponding PDF is identical, except that the half-normal distribution @eq:half_normal is mirrored around the y-axis.
Because the resulting PDF $f_Z^"neg"$ is a mirrored variant of $f_Z$ and each sign of $z$ occurs with probability $1/2$, we can define a new, normalized PDF $f_Z^*$ as
$
f_Z^* (z) = 1/2 abs(f_Z (z)),
$
on the entire z-axis.
$f_Z^* (z)$ now describes the final random distribution after the application of our optimization of the input values $x_i$.
#figure(
include("../graphics/plots/z_distribution.typ"),
caption: [Optimized input values $z$ overlaid with sign-based quantizer $cal(Q)$]
)<fig:z_pdf>
@fig:z_pdf shows two key properties of this optimization:
1. Adjusting the input values using the method described above does not require any adjustment of the decision threshold of the sign-based quantizer.
2. The resulting PDF is zero at $z = 0$ leaving no input value for the sign-based quantizer at its decision threshold.
=== Generating helper-data
To find the optimal set of helper-data that will result in the distribution shown in @fig:z_pdf, we can define the vector of all possible linear combinations $bold(z)$ as the vector-matrix multiplication of the vector of input values $bold(x)$ and the matrix $bold(H)$ of all weight combinations with $h_i in {plus.minus 1}$:
$
bold(z) &= bold(x) dot bold(H)\
$<eq:z_combinations>
We will choose the optimal weights based on the highest absolute value of $bold(z)$, as that value will be the furthest away from $0$.
//We may encounter two entries in $bold(z)$ that both have the same highest absolute value.
//In that case, we will choose the combination of weights randomly out of our possible options.
To avoid encountering two entries in $bold(z)$ that both have the same highest absolute value, we can always set the first helper data bit to $h_1 = 1$.
Considering our single-bit quantization case, @eq:z_combinations can be written as:
$
bold(z) = vec(x_1, x_2) dot mat(delim: "[", +1, -1, +1, -1; +1, +1, -1, -1)
$
The vector of optimal weights $bold(h)_"opt"$ can now be found through $op("argmax")_h abs(bold(z))$.
If we take a look at the dimensionality of the matrix of all weight combinations, we notice that we will need to store only $1$ helper-data bit per quantized symbol because $h_1$ is set to $1$.
In fact, we will show later that the amount of helper-data bits used by this HDA is directly linked to the number of input values used instead of the number of bits we want to extract during quantization.
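The single-bit case can be sketched in a few lines. This is a minimal illustration, assuming $h_1 = +1$ is fixed and that a non-negative value maps to bit 1 (the sign convention is an assumption); the function names are hypothetical.

```python
def enroll_sign_pair(x1: float, x2: float) -> tuple[int, float, int]:
    """Return (h2, z, bit) for one pair of readouts with h1 fixed to +1."""
    # With h1 = +1 only h2 is free: pick the sign that maximises |x1 + h2 * x2|.
    candidates = [(abs(x1 + h2 * x2), h2) for h2 in (+1, -1)]
    _, h2 = max(candidates)
    z = x1 + h2 * x2
    bit = 1 if z >= 0 else 0  # assumed sign convention
    return h2, z, bit

def reconstruct_sign_pair(x1: float, x2: float, h2: int) -> int:
    """Recompute the linear combination with the stored helper bit and quantize."""
    z = x1 + h2 * x2
    return 1 if z >= 0 else 0

h2, z, bit = enroll_sign_pair(0.3, -0.5)          # enrollment readout
bit_rec = reconstruct_sign_pair(0.25, -0.45, h2)  # noisy re-measurement
```

Only `h2` has to be stored per extracted bit, which matches the one helper-data bit per symbol stated above.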
== Generalization to higher-order bit quantization
We can generalize the idea of @sect:1-bit-opt and apply it for a higher-order bit quantization.
Contrary to @smhdt, we will always use the same step function as quantizer and optimize the input values $x$ to be the furthest away from any decision threshold.
In this higher-order case, this means that we want to optimise our input values as far away as possible from the nearest decision threshold of the quantizer instead of just maximising the absolute value of the linear combination.
For a complete generalization of this method, we will also parametrize the amount of addends $N$ in the linear combination of $z$.
That means we can define $z$ from now on as:
$
z = sum_(i=1)^(N) x_i dot h_i
$<eq:z_eq>
We can define the condition to test whether a given linear combination is optimal as follows:\
The optimal linear combination $z_"opt"$ is found, when the distance to the nearest quantizer decision bound is maximised.
Finding the weights $bold(h)_"opt"$ of the optimal linear combination $z_"opt"$ can be formalized as:
$
bold(h)_"opt" = op("argmax")_h op("min")_j abs(bold(h)^T bold(x) - b_j) "s.t." h_i in {plus.minus 1}
$<eq:optimization>
==== Example with 2-bit quantizer
//Let's consider the following example using a 2-bit quantizer:\
We can define the bounds of the 2-bit quantizer $bold(b)$ as $[-alpha, 0, alpha]$, omitting the bounds $plus.minus infinity$.
The values of $bold(b)$ are already placed in the real domain to directly quantize normal distributed input values.
A simple way to solve @eq:optimization is to use a brute force method and calculate all distances to every quantization bound $b_j$, because the number of possible combinations is finite.
Furthermore, finding a solution for @eq:optimization analytically proves to be significantly more complex.
The linear combination $z$ for the amount of addends $N = 2$ is defined as
$
z = x_1 dot h_1 plus x_2 dot h_2
$<eq:bach_z_example>
According to @eq:z_combinations, all possible linear combinations for two input values $x_1 "and" x_2$ of @eq:bach_z_example can be collected as the vector $bold(z)$ of length $2^N |_(N=2) =4$:
$
bold(z) = vec(z_1\, z_2\, z_3\, z_4)
$
Calculating the absolute distances to every quantizer bound $b_j$ for all linear combinations $z_i$ gives us the following distance matrix:
$
bold(cal(A)) = mat(
a_(1,1), a_(2,1), a_(3,1), a_(4,1);
a_(1,2), a_(2,2), a_(3,2), a_(4,2);
a_(1,3), a_(2,3), a_(3,3), a_(4,3);
),
$<mat:distance_A>
where $a_(i,j) = abs(z_i - b_j)$.
Now we want to find the bound $b_j$ to which every $z_i$ is closest.
This can be achieved by determining the minimum value for each column of the matrix $bold(cal(A))$.
The resulting vector $bold(nu)$ now consists of the distance to the nearest quantizer bound for every linear combination with entries defined as:
$
nu_i = min{a_(i,j) | 1 <= j <= 3} "for each" i = 1, 2, 3, 4.
$
The optimal linear combination $z_"opt"$ can now be found as the entry $z_i$ of $bold(z)$ where its corresponding distance $nu_i$ is maximised.
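The brute-force search described in this example can be sketched as follows. The function name and the choice $alpha = 1$ are assumptions for illustration only; ties are resolved by keeping the first combination found.

```python
from itertools import product

def best_weights_by_distance(x, bounds):
    """Brute-force search for the weights that maximise the distance
    of the linear combination to its nearest quantizer bound."""
    best = None
    for h in product((+1, -1), repeat=len(x)):
        z = sum(h_i * x_i for h_i, x_i in zip(h, x))
        nu = min(abs(z - b) for b in bounds)  # distance to the nearest bound
        if best is None or nu > best[0]:
            best = (nu, h, z)
    nu_opt, h_opt, z_opt = best
    return h_opt, z_opt, nu_opt

# Two readouts and the bounds [-alpha, 0, alpha] with a hypothetical alpha = 1.
h_opt, z_opt, nu_opt = best_weights_by_distance((0.5, 1.25), (-1.0, 0.0, 1.0))
```

For these inputs the four candidates are 1.75, -0.75, 0.75 and -1.75, so the search keeps the weights (1, 1) with a distance of 0.75 to the nearest bound.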
=== Simulation of the bound distance maximisation strategy<sect:instability_sim>
Two important points were anticipated in the preceding example:
1. We cannot define the resulting random distribution $Z$ after performing this operation analytically and thus also not the quantizer bounds $bold(b)$.
A way to account for that is to initially guess the resulting random distribution and $bold(b)$, and then to repeat the optimization using quantizer bounds found through the @ecdf of the resulting linear combination values.
2. If the optimization described above is repeated multiple times using an @ecdf, the resulting random distribution $Z$ must converge to a stable random distribution. Otherwise we will not be able to carry out a reliable quantization in which the symbols are uniformly distributed.
To check that the strategy for optimizing the linear combination provided in the example above results in a converging random distribution, we will perform a simulation of the optimization as described in the example using $100 space.nobreak 000$ simulated realizations of the standard normal distribution with the parameters $mu = 0$ and $sigma = 1$.
@fig:bach_instability shows various histograms of the vector $bold(z)_"opt"$ after different iterations.
Even though the overall shape of the distribution comes close to our goal of moving the input values away from the quantizer bounds $bold(b)$, the distribution itself does not converge to one specific, final shape.
It seems that the resulting distributions for each iteration oscillate in some way, since the distributions for iterations $7$ and $25$ have the same shape.
However, the sequence of distributions seems to be chaotic and thus does not seem suitable for further quantization.
#figure(
grid(
columns: (1fr, 1fr),
rows: (2),
[//#figure(
#image("../graphics/plots/bach/instability/frame_1.png")
#v(-2em)
//)
Iteration 1],
[//#figure(
#image("../graphics/plots/bach/instability/frame_7.png")
#v(-2em)
//)
Iteration 7],
[//#figure(
#image("../graphics/plots/bach/instability/frame_18.png")
#v(-2em)
//)
Iteration 18],
[//#figure(
#image("../graphics/plots/bach/instability/frame_25.png")
#v(-2em)
//)
Iteration 25]
),
caption: [Probability distributions for various iterations]
)<fig:bach_instability>
=== Center Point Approximation
For that reason, we will now propose a different strategy to find the weights for the optimal linear combination $z_"opt"$.
Instead of defining the desired outcome of $z_"opt"$ as the greatest distance to the nearest quantizer decision threshold, we will define a vector $bold(cal(o)) = [cal(o)_1, cal(o)_2, ..., cal(o)_(2^M)]$ containing the optimal values that we want to approximate with $z$.
Considering an $M$-bit quantizer with $2^M$ steps, we can define the values of $bold(cal(o))$ as the center points of these quantizer steps.
Its cardinality is $2^M$.
Note that $bold(cal(o))$ consists of optimal values that we may not be able to reach exactly using a linear combination based on weights and our given input values.
We can find the optimal linear combination $z_"opt"$ by finding the minimum of all distances to all optimal points defined in $bold(cal(o))$.
The matrix that contains the distances of all linear combinations $bold(z)$ to all optimal points $bold(cal(o))$ is defined as $bold(cal(A))$ with its entries $a_(i j) = abs(z_i - cal(o)_j)$.\
$z_"opt"$ can now be defined as the linear combination $z_i$ that belongs to the minimal entry of $bold(cal(A))$:
$
z_"opt" = z_(i^*) quad "with" quad (i^*, j^*) = op("argmin")_(i, j) a_(i j).
$
#figure(
kind: "algorithm",
supplement: [Algorithm],
include("../pseudocode/bach_find_best_appr.typ")
)<alg:best_appr>
@alg:best_appr shows a programmatic approach to find the set of weights for the best approximation. The algorithm returns a tuple consisting of the weight combination $bold(h)$ and the resulting value of the linear combination $z_"opt"$.
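The included pseudocode is not reproduced here, so the following is only a minimal sketch of such a search, assuming $h_1 = +1$ is fixed and the remaining weights are enumerated exhaustively. The function name and the example center points are hypothetical.

```python
from itertools import product

def find_best_approximation(x, centers):
    """Return (h, z) for the linear combination closest to any center point."""
    best = None
    # Fix h_1 = +1 and enumerate the remaining N - 1 signs.
    for tail in product((+1, -1), repeat=len(x) - 1):
        h = (+1, *tail)
        z = sum(h_i * x_i for h_i, x_i in zip(h, x))
        dist = min(abs(z - o) for o in centers)  # distance to the nearest center
        if best is None or dist < best[0]:
            best = (dist, h, z)
    _, h_opt, z_opt = best
    return h_opt, z_opt

# Three readouts and four hypothetical center points of a 2-bit quantizer.
h_opt, z_opt = find_best_approximation((0.5, 1.25, 0.25), (-1.5, -0.5, 0.5, 1.5))
```

In this example the combination 0.5 + 1.25 - 0.25 hits the center point 1.5 exactly, so the search returns the weights (1, 1, -1).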
Because the superposition of different linear combinations of normal distributions corresponds to a Gaussian Mixture Model, finding the ideal set of points $bold(cal(o))$ analytically is impossible.
Instead, we will first estimate $bold(cal(o))$ based on the normal distribution parameters after performing multiple convolutions with the input distribution $X$.
The parameters of a multiply convolved normal distribution are defined as:
$
sum_(i=1)^(n) cal(N)(mu_i, sigma_i^2) tilde cal(N)(sum_(i=1)^n mu_i, sum_(i=1)^n sigma_i^2),
$
where $n$ defines the number of convolutions performed @schmutz.
With this definition, we can define the parameters of the probability distribution $Z$ of the linear combinations $z$ based on the parameters of $X$, $mu_X$ and $sigma_X$:
$
Z(mu_Z, sigma_Z^2) = Z(sum_(i=1)^n mu_X, sum_(i=1)^n sigma_X^2)
$<eq:z_dist_def>
The parameters $mu_Z$ and $sigma_Z$ allow us to apply an inverse CDF on a multi-bit quantizer $cal(Q)(2, tilde(x))$ defined in the tilde-domain.
Our initial values for $bold(cal(o))_"first"$ can now be defined as the centers of the steps of the transformed quantizer function $cal(Q)(2, x)$.
These points can be found easily, except for the outermost center points, whose quantizer steps have a bound at $plus.minus infinity$.\
However, we can still find these two remaining center points by artificially defining the outermost bounds of the quantizer as $frac(1, 2^(2 dot M))$ and $frac((2^(2 dot M))-1, 2^(2 dot M))$ in the tilde-domain and also apply the inverse CDF to them.
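The initial estimate can be sketched with the standard library. This is an illustration under stated assumptions: the distribution of $Z$ follows @eq:z_dist_def, and each center point is taken as the midpoint of its step in the real domain, which the text does not spell out. The function name is hypothetical.

```python
from statistics import NormalDist

def initial_center_points(m_bits: int, n_addends: int,
                          mu_x: float = 0.0, sigma_x: float = 1.0):
    """Estimate the initial center points from the Gaussian approximation of Z."""
    # Sum of n i.i.d. normal variables: means and variances add up.
    dist_z = NormalDist(n_addends * mu_x, (n_addends ** 0.5) * sigma_x)
    steps = 2 ** m_bits
    # Equidistant bounds in the tilde-domain; the outermost bounds are the
    # artificial values 1 / 2^(2M) and (2^(2M) - 1) / 2^(2M).
    eps = 1.0 / 2 ** (2 * m_bits)
    tilde_bounds = [eps] + [k / steps for k in range(1, steps)] + [1.0 - eps]
    # Transform the bounds into the real domain with the inverse CDF.
    bounds = [dist_z.inv_cdf(p) for p in tilde_bounds]
    # Assumption: a center point is the midpoint of a step in the real domain.
    return [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]

centers = initial_center_points(m_bits=2, n_addends=3)
```

For a zero-mean input the resulting points are symmetric around the origin, with one center point per quantizer step.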
#scale(x: 90%, y: 90%)[
#figure(
include("../graphics/quantizers/two-bit-enroll-real.typ"),
caption: [Quantizer for the distribution resulting from a triple convolution with distribution parameters $mu_X=0$ and $sigma_X=1$ with marked center points of the quantizer steps]
)<fig:two-bit-enroll-find-centers>]
We can now use an iterative algorithm that alternates between optimizing the quantizing bounds of $cal(Q)$ and our vector of optimal points $bold(cal(o))_"first"$.
#figure(
kind: "algorithm",
supplement: [Algorithm],
include("../pseudocode/bach_1.typ")
)<alg:bach_1>
We can see both of these alternating parts in @alg:bach_1_2[Lines] and @alg:bach_1_3[] of @alg:bach_1.
To optimize the quantizing bounds of $cal(Q)$, we will sort the values of all the resulting linear combinations $bold(z)_"opt"$ in ascending order.
Using the inverse @ecdf defined in @eq:ecdf_inverse, we can find new quantizer bounds based on $bold(z)_"opt"$ from the first iteration.
These bounds will then be used to define a new set of optimal points $bold(cal(o))$ used for the next iteration.
During every iteration of @alg:bach_1, we will store all weights $bold(h)$ used to generate the vector for optimal linear combinations $bold(z)_"opt"$.
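The bound update of one iteration could look like the sketch below. Since the inverse ECDF definition is not repeated in this chapter, an index-based empirical quantile is used as a stand-in; closing the outermost steps with the smallest and largest observed value is likewise an assumption. The function name is hypothetical.

```python
def update_bounds_and_centers(z_opt, m_bits: int):
    """Derive new inner quantizer bounds and center points from the values z_opt."""
    z_sorted = sorted(z_opt)
    n = len(z_sorted)
    steps = 2 ** m_bits
    # Inner bounds: empirical quantiles at k / 2^M, so that every step
    # holds roughly the same number of values.
    inner = [z_sorted[(k * n) // steps] for k in range(1, steps)]
    # Assumption: close the outermost steps with the extreme observed values.
    bounds = [z_sorted[0]] + inner + [z_sorted[-1]]
    centers = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]
    return inner, centers

inner, centers = update_bounds_and_centers([5, 3, 0, 7, 1, 6, 2, 4], m_bits=2)
```

The returned center points would then serve as the vector of optimal points for the next call of the approximation step.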
We can also use a simulation here to check the convergence of the distribution $Z$ using the same input values and quantizer configurations as in @sect:instability_sim.
#figure(
grid(
columns: (2),
[#figure(
image("./../graphics/plots/bach/stability/frame_1.png"),
//caption: [Iteration 1]
)
#v(-2em)
Iteration 1],
[#figure(
image("./../graphics/plots/bach/stability/frame_25.png")
)
#v(-2em)
Iteration 25],
),
caption: [Probability distributions for the first and 25th iteration of the center point approximation method]
)<fig:bach_stability>
Comparing the distributions in @fig:bach_stability, we can see that, besides a closer arrangement, the overall shape of the probability distribution $Z$ converges to a stable distribution that represents the distribution $Z$ originally estimated through @eq:z_dist_def as a composition of smaller normal distributions.
The output of @alg:bach_1 is the vector of optimal weights $bold(h)_"opt"$.
$bold(h)_"opt"$ can now be used to complete the enrollment phase and quantize the values $bold(z)_"opt"$.
To perform reconstruction, we can calculate the same linear combination used during enrollment with the generated helper-data and the new PUF readout measurements.
We can lower the computational complexity of this approach by using the assumption that the realizations of $X$ are i.i.d.
The end result of $bold(cal(o))$ can be calculated once for a specific device series and saved in the ROM of each device.
During enrollment, only the vector $bold(h)_"opt"$ has to be calculated.
=== Helper-data size and amount of addends
The amount of helper data is directly linked to the symbol bit width $M$ and the amount of addends $N$ used in the linear combination.
Because we can set the first helper data bit $h_1$ of a linear combination to $1$ to omit the random choice, the resulting extracted bit to helper data bit ratio $cal(r)$ can be defined as $cal(r) = frac(M, N-1)$, whose equation is similar to the one we used in the @smhdt analysis.
== Experiments
To test our implementation of @bach using the previously introduced center point approximation, we conducted an experiment similar to the one in @sect:smhd_experiments.
However, we have omitted the analysis over different temperatures for the enrollment and reconstruction phase here, as the behaviour of @bach corresponds to that of @smhdt in this matter.
As in the S-Metric analysis, the resulting dataset consists of the bit error rates of various configurations with quantization symbol widths of up to $4$ bits evaluated with up to $10$ addends for the linear combinations.
== Results & Discussion
We can now compare the #glspl("ber") of different @bach configurations.
/*#figure(
table(
columns: (9),
align: center + horizon,
inset: 7pt,
[*BER*],[N=2],[N=3],[N=4],[N=5], [N=6], [$N=7$], [$N=8$], [$N=9$],
[$M=1$], [$0.09$], [$0.09$], [$0.012$], [$0.018$], [$0.044$], [$0.05$], [$0.06$], [$0.07$],
[$M=2$], [$0.03$], [$0.05$], [$0.02$], [$0.078$], [$0.107$], [$0.114$], [$0.143$], [$0.138$],
[$M=3$], [$0.07$], [$0.114$], [$0.05$], [$0.15$], [$0.2$], [$0.26$], [$0.26$], [$0.31$],
[$M=4$], [$0.13$], [$0.09$], [$0.18$], [$0.22$], [$0.26$], [$0.31$], [$0.32$],[$0.35$]
),
caption: [#glspl("ber") of different @bach configurations]
)<tab:BACH_performance>*/
#figure(
kind: table,
tablex(
columns: 9,
align: center + horizon,
inset: 7pt,
// Color code the table like a heat map
map-cells: cell => {
if cell.x > 0 and cell.y > 0 {
cell.content = {
let value = float(cell.content.text)
let text-color = if value >= 0.3 {
red.lighten(15%)
} else if value >= 0.2 {
red.lighten(30%)
} else if value >= 0.15 {
orange.darken(10%)
} else if value >= 0.1 {
yellow.darken(13%)
} else if value >= 0.08 {
yellow
} else if value >= 0.06 {
olive
} else if value >= 0.04 {
green.lighten(10%)
} else if value >= 0.02 {
green
} else {
green.darken(10%)
}
cell.fill = text-color
strong(cell.content)
}
}
cell
},
[*BER*], [$N=2$], [$N=3$], [$N=4$], [$N=5$], [$N=6$], [$N=7$], [$N=8$], [$N=9$],
[$M=1$], [0.01], [0.01], [0.012], [0.018], [0.044], [0.05], [0.06], [0.07],
[$M=2$], [0.03], [0.05], [0.02], [0.078], [0.107], [0.114], [0.143], [0.138],
[$M=3$], [0.07], [0.114], [0.05], [0.15], [0.2], [0.26], [0.26], [0.31],
[$M=4$], [0.13], [0.09], [0.18], [0.22], [0.26], [0.31], [0.32],[0.35],
[$M=5$], [0.29], [0.21], [0.37], [0.31], [0.23], [0.23], [0.19], [0.15],
[$M=6$], [0.15], [0.33], [0.15], [0.25], [0.21], [0.23], [0.19], [0.14]
),
caption: [#glspl("ber") of different @bach configurations]
)<tab:BACH_performance>
@tab:BACH_performance shows the #glspl("ber") of @bach configurations with $N$ addends and extracting $M$ bits out of one input value $z$.
The first interesting property we can observe is the dip that @bach produces in the @ber for the first three bit combinations $M = 1, 2 "and" 3$ at around $N = 3$ and $N = 4$.
At these points, the @ber experiences a drop followed by a steady rise again for higher numbers of $N$.
//This observation could be explained through the fact that the higher $N$ is chosen, the shorter the resulting key, since $N$ divides out values available for quantization by $N$.
If $M$ is generally chosen higher, @bach seems to return unstable results, halving the @ber as $N$ reaches $9$ for $M=5$ but showing no real improvement for various addends if $M=6$.
We can also compare the performance of @bach using the center point approximation approach with the #glspl("ber") of higher order bit quantizations that don't use any helper data.
#figure(
table(
columns: 7,
[*M*], [$1$], [$2$], [$3$], [$4$], [$5$], [$6$],
[*BER*], [$0.013$], [$0.02$], [$0.04$], [$0.07$], [$0.11$], [$0.16$]
),
caption: [#glspl("ber") for higher order bit quantization without helper data ]
)<tab:no_hd>
Unfortunately, the comparison of #glspl("ber") of @tab:no_hd[Tables] and @tab:BACH_performance[] shows that our current realization of @bach either ties the @ber in @tab:no_hd or is worse.
Let's find out why this happens.
==== Discussion
If we take a step back and look at the performance of the optimized single-bit sign-based quantization process of @sect:1-bit-opt, we can compare the following #glspl("ber"):
#figure(
table(
columns: 2,
[*No helper data*], [$0.013$],
[*With helper data using greatest distance*],[$0.00052$],
[*With helper data using center point approximation*], [$0.01$]
),
caption: [Comparison of #glspl("ber") for the single-bit quantization process with and without helper data]
)<tab:comparison_justification>
As we can see in @tab:comparison_justification, generating the helper data based on the original idea where @eq:optimization is used improves the @ber of the single-bit quantization by approx. $96%$.
The probability distributions $Z$ of the two different realizations of @bach -- namely the distance maximization strategy and the center point approximation -- give an indication of this discrepancy:
#figure(
grid(
columns: (2),
[#figure(
image("../graphics/plots/bach/compare/bad.png")
)
#v(-2em)
Center point approximation],
[#figure(
image("../graphics/plots/bach/compare/good.png")
)
#v(-2em)
Distance maximization],
),
caption: [Comparison of the histograms of the different strategies to obtain the optimal weights for the single-bit case]
)<fig:compar_2_bach>
@fig:compar_2_bach shows the two different probability distributions.
We can observe that using a vector of optimal points $bold(cal(o))$ results in a narrower distribution for $Z$ than just maximizing the linear combination to be as far away from $x=0$ as possible.
This difference in the shape of both distributions seems to be the main contributor to the fact that the optimization using center point approximation yields no improvement for the quantization process.
Unfortunately, we were not able to define an algorithm translating this idea to a higher-order bit quantization for which the resulting probability distribution $Z$ converges.
Taking a look at the unstable probability distributions issued by the bound distance maximization strategy in @fig:bach_instability, we can get an idea of what kind of distribution a @bach algorithm should achieve.
While the inner parts of the distributions do not overlap with each other like in the stable iterations shown in @fig:bach_stability, the outermost values of these distributions resemble the shape of what we achieved using the distance maximization for a single-bit optimization.
These two properties could -- if the distribution converges -- result in far better #glspl("ber") for higher order bit quantization, as the comparison in @tab:comparison_justification indicates.


@ -1,485 +0,0 @@
#import "@preview/drafting:0.2.0": *
#import "@preview/glossarium:0.4.1": *
= S-Metric Helper Data Method <chap:smhd>
A metric-based @hda generates helper data at PUF enrollment to provide more reliable results at the reconstruction stage.
Each of these metrics corresponds to a quantizer with different bounds to lower the risk of bit or symbol errors during reconstruction.
For this kind of @hda, the generated metric is used as helper data and thus does not have to be kept secret.
== Background
Before we turn to a concrete realization of the S-Metric method, let's take a look at its predecessor, the Two-Metric Helper Data Method.
/*=== Distribution Independency <sect:dist_independency>
The publications for the Two-Metric approach @tmhd1 and @tmhd2, as well as the generalized S-Metric approach @smhd make the assumption, that the PUF readout is zero-mean Gaussian distributed @smhd.
We propose, that a Gaussian distributed input for S-Metric quantization is not required for the operation of this quantizing algorithm.
Instead, any distribution can be used for input values given, that a CDF exists for that distribution and its parameters are known.
As already mentioned in @tilde-domain, this transformation will result in uniformly distributed values, where equi-probable areas in the real domain correspond to equi-distant areas in the Tilde-Domain.
Contrary to @tmhd1, @tmhd2 and @smhd, which display relevant areas as equi-probable in a normal distribution, we will use equi-distant areas in a uniform distribution for better understandability.
It has to be mentioned, that instead of transforming all values of the PUF readout into the Tilde-Domain, we could also use an inverse CDF to transform the bounds of our evenly spaced areas into the real domain with (normal) distributed values, which can be assessed as remarkably less computationally complex.#margin-note[Address this later]
*/
=== Two-Metric Helper Data Method <sect:tmhd>
The simplest form of a metric-based @hda is the Two-Metric Helper Data Method.
Its quantization only yields symbols of 1-bit width and it only uses a single bit of helper data to store the choice of metric.
@fig:tmhd_example_enroll and @fig:tmhd_example_reconstruct illustrate an example enrollment and reconstruction process.
Consider the marked point the value of the initial measurement and the marked range our margin of error.
If we now were to use the original quantizer shown in @fig:tmhd_example_enroll during both the enrollment and the reconstruction phases, we would risk a bit error, because the margin of error overlaps with the lower quantization bound $-a$, which we can call a point of uncertainty.
To alleviate this, we generate helper data during enrollment as depicted in @fig:tmhd_enroll, which allows us to make use of a different quantizer $cal(R)(1, 2, x)$ whose boundaries do not overlap with the error margin.
#scale(x: 90%, y: 90%)[
#figure(
grid(
columns: (1fr, 1fr),
[#figure(
include("../graphics/quantizers/two-metric/example_enroll.typ"),
caption: [Example enrollment]) <fig:tmhd_example_enroll>],
[#figure(
include("../graphics/quantizers/two-metric/example_reconstruct.typ"),
caption: [Example reconstruction]) <fig:tmhd_example_reconstruct>]
),
caption: [Example enrollment and reconstruction of @tmhdt. The window function describes the quantizer used to define the resulting bit. The red dot shows a possible @puf readout measurement with its blue marked strip as margin of error.])]
Publications @tmhd1 and @tmhd2 find all the relevant bounds for the enrollment and reconstruction phases under the assumption that the PUF readout (our input value $x$) is zero-mean Gaussian distributed.
//Because the parameters for symbol width and number of metrics always stays the same, it is easier to calculate #m//argin-note[obdA annehmen hier] the bounds for 8 equi-probable areas with a standard deviation of $sigma = 1$ first and then multiplying them with the estimated standard deviation of the PUF readout.
Because the parameters for symbol width and number of metrics always stay the same, we can -- without loss of generality -- assume the standard deviation as $sigma = 1$ and calculate the bounds for 8 equi-probable areas for this distribution.
This is done by finding two bounds $a$ and $b$ such that
$ integral_a^b f_X(x) \dx = 1/8 $
This operation yields 9 bounds defining these areas: $-infinity$, $-\T1$, $-a$, $-\T2$, $0$, $\T2$, $a$, $\T1$ and $+infinity$.
During the enrollment phase, we will use $plus.minus a$ as our quantizing bounds, returning $0$ if the absolute value of $x$ is smaller than $a$ and $1$ otherwise.
The corresponding metric is chosen based on the following conditions:
$ M = cases(
\M1\, x < -a or 0 < x < a,
\M2\, -a < x < 0 or a < x
)space.en. $
@fig:tmhd_enroll shows the curve of a quantizer $cal(Q)$ that would be used during the Two-Metric enrollment phase.
#scale(x: 90%, y: 90%)[
#grid(
columns: (1fr, 1fr),
[#figure(
include("../graphics/quantizers/two-metric/enrollment.typ"),
caption: [Two-Metric enrollment]) <fig:tmhd_enroll>],
[#figure(
include("../graphics/quantizers/two-metric/reconstruction.typ"),
caption: [Two-Metric reconstruction]) <fig:tmhd_reconstruct>]
)
]
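The enrollment step can be sketched with the standard library. The sketch assumes a standard normal readout ($sigma = 1$), takes the three positive octile bounds from the inverse CDF and assigns readouts that fall exactly on a bound to the second metric; the names `tmhd_enroll`, `T1`, `A` and `T2` are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)
# Positive octile bounds of the standard normal distribution.
T2 = STD_NORMAL.inv_cdf(5 / 8)
A = STD_NORMAL.inv_cdf(6 / 8)
T1 = STD_NORMAL.inv_cdf(7 / 8)

def tmhd_enroll(x: float) -> tuple[int, str]:
    """Return the enrolled bit and the chosen metric for a readout x."""
    # Quantize with the bounds -a and +a.
    bit = 0 if abs(x) < A else 1
    # Helper data: which of the two metrics the readout belongs to.
    metric = "M1" if (x < -A or 0 < x < A) else "M2"
    return bit, metric

bit, metric = tmhd_enroll(0.3)
```

Each area between two neighbouring bounds carries a probability of one eighth, which can be verified with `STD_NORMAL.cdf`.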
As previously described, each of these metrics corresponds to a different quantizer.
In the reconstruction phase, we can use the generated helper data and define a reconstructed bit based on the chosen metric as follows:
$ #grid(
columns: (1fr, 1fr),
align: (center, center),
math.equation($\M1: k = cases(0\, x < -\T1 or \T2 < x, 1\, -\T1 < x < \T2),$, block: true, numbering: none),
math.equation($\M2: k = cases(0\, x < -\T2 or \T1 < x, 1\, -\T2 < x < \T1).$, block: true, numbering: none)
) $
@fig:tmhd_reconstruct illustrates the basic idea behind the Two-Metric method. Using the helper data, we will move the bounds of the original quantizer (@fig:tmhd_example_enroll) one octile to each side, yielding two new quantizers.
The advantage of this method comes from moving the point of uncertainty away from our enrollment-time readout.
=== #gls("smhdt", long: true)
Going further, the Two-Metric Helper Data Method can be generalized as shown in @smhd.
This generalization allows for higher-order bit quantization and the use of more than two metrics.
A key difference to the Two-Metric approach is the alignment of quantization areas.
Methods described in @tmhd1 and @tmhd2 use two bounds for 1-bit quantization, namely $plus.minus a$.
In contrast, the method introduced by Fischer in @smhd resembles a sign-based quantizer if the configuration $cal(Q)(2, 1)$ is used, with only one quantization bound at $x=0$.
@fig:smhd_compar1 and @fig:smhd_compar2 illustrate this difference.
#grid(
columns: (1fr, 1fr),
[#figure(
include("../graphics/quantizers/s-metric/s-metric-compar1.typ"),
caption: [Two-Metric enrollment]
)<fig:smhd_compar1>],
[#figure(
include("../graphics/quantizers/s-metric/s-metric-compar2.typ"),
caption: [S-Metric enrollment with 1-bit configuration]
)<fig:smhd_compar2>]
)
The generalization consists of two components:
- *Higher-order bit quantization* \
We can introduce more steps to our quantizer and use them to extract more than one bit out of our PUF readout.
- *More than two metrics* \
Instead of splitting each quantizer into only two equi-probable parts, we can increase the number of metrics at the cost of generating more helper data to increase reliability.
== Realization<sect:smhd_implementation>
We will now propose a specific realization of the S-Metric Helper Data Method. \
Instead of using the @puf readout directly for @smhdt, we can use a @cdf to transform these values into the tilde domain.
The only requirement we would need to meet here is that the @cdf of the probability distribution used is known.
This allows us to use equi-distant bounds for the quantizer instead of equi-probable ones.
From now on we will use the following syntax for quantizers that use the S-Metric Helper Data Method:
$ cal(Q)(S, M, tilde(x)), $
where $S$ defines the number of metrics, $M$ the number of bits and $tilde(x)$ a Tilde-Domain transformed PUF measurement.
=== Enrollment
To enroll our PUF key, we will first need to define the quantizer for higher order bit quantization and helper data generation.
Because our transformed PUF readout $tilde(x)$ can be interpreted as a realization of a uniformly distributed variable $tilde(X)$, we can define the width $Delta$ of our quantizer bins as follows:
$ Delta = frac(1, 2^M) . $<eq:delta>
For example, if we were to extract a symbol with a width of 2 bits from our PUF readout, we would need to evenly space $2^2 = 4$ bins. Using @eq:delta, the step size for a 2-bit quantizer would be:
$ Delta' = lr(frac(1, 2^M) mid(|))_(M=2)= frac(1, 4) . $
@fig:smhd_two_bit shows a plot of the resulting quantizer function that would yield symbols with two bits for one measurement $tilde(x)$.
#figure(
include("../graphics/quantizers/two-bit-enroll.typ"),
caption: [2-bit quantizer]
)<fig:smhd_two_bit>
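As a minimal sketch of this equidistant quantizer, the following Python snippet maps a tilde-domain value to the index of its bin using the step size from @eq:delta. The function name `quantize` and the choice to place $tilde(x) = 1$ into the last bin are assumptions made for illustration only.

```python
import math


def quantize(x_tilde: float, m: int) -> int:
    """Return the index of the equidistant bin that contains x_tilde."""
    if not 0.0 <= x_tilde <= 1.0:
        raise ValueError("x_tilde must lie in the tilde domain [0, 1]")
    delta = 1.0 / 2**m               # bin width, Delta = 1 / 2^M
    k = math.floor(x_tilde / delta)  # index of the bin containing x_tilde
    return min(k, 2**m - 1)          # assumption: x_tilde = 1 belongs to the last bin


if __name__ == "__main__":
    # 2-bit example: four bins of width 1/4
    print([quantize(x, 2) for x in (0.1, 0.3, 0.6, 0.9)])
```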
Right now, this quantizer would not help us generate any helper data.
To achieve that, we will need to divide a symbol step (one that returns the corresponding quantized symbol) into multiple sub-steps.
Using $S$, we can define the step size $Delta_S$ as the division of $Delta$ by $S$:
$ Delta_S = frac(Delta, S) = frac(1, 2^M dot S) $<eq:delta_s>
/*After this definition #margin-note[Absatz nochmal neu], we need to make an adjustment to our previously defined quantizer function, because we cannot simply return the quantized value based on a quantizer with step size $Delta_s$.
That would just increase the amounts of bits we will extract out of one measurement.
Instead, we will need to return a tuple, consisting of the quantized symbol and the metric ascertained that we will save as helper data for later.
*/
We can now redefine our previously defined quantizer function to not only return the quantized symbol, but a tuple consisting of the quantized symbol and the metric ascertained that we will save as helper data for later.
Continuing our example, we could choose the number of metrics to be 2. According to @eq:delta_s, we would then halve our step size:
$ Delta'_S = lr(frac(Delta', S)mid(|))_(S=2) = frac(1, 4 dot 2) = frac(1, 8) $
This means that we can update our quantizer function with the new step size $Delta'_S = frac(1, 8)$ and redefine its output as a tuple consisting of bit value and helper data.
We can visualize the quantizer that we will use during the enrollment phase of a 2-bit 2-metric configuration as depicted in @fig:smhd_2_2_en.
#grid(
columns: (1fr, 1fr),
[#scale(x: 80%, y: 80%)[
#figure(
include("../graphics/quantizers/s-metric/2_2_en.typ"),
caption: [2-bit 2-metric enrollment]
) <fig:smhd_2_2_en>]],
[#scale(x: 80%, y: 80%)[
#figure(
include("../graphics/quantizers/s-metric/3_2_en.typ"),
caption: [2-bit 3-metric enrollment]
) <fig:smhd_3_2_en>]])
To better demonstrate the generalization to $S$-metrics, @fig:smhd_3_2_en shows a 2-bit quantizer that generates helper data based on three metrics instead of two.
In that sense, increasing the number of metrics will increase the number of sub-steps for each symbol.
We can now perform the enrollment of a full PUF readout.
Each measurement will be quantized with our quantizer $cal(E)$, returning a tuple consisting of the quantized symbol and helper data.
$ kappa_i = cal(E)(S, M, tilde(x_i)) = (k, h)_i space.en. $ <eq:smhd_quant>
Performing the operation of @eq:smhd_quant for our whole set of measurements will yield a vector of tuples $bold(kappa)$.
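The enrollment step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `enroll` is hypothetical, symbols are returned as integer indices, and the metrics are numbered $0, ..., S-1$ from left to right within each symbol step (so that index $0$ plays the role of M1).

```python
import math


def enroll(x_tilde: float, s: int, m: int) -> tuple[int, int]:
    """Return (k, h): the quantized symbol index and the metric index."""
    if not 0.0 <= x_tilde <= 1.0:
        raise ValueError("x_tilde must lie in the tilde domain [0, 1]")
    n_sub = 2**m * s  # number of sub-steps, each of width Delta_S = 1 / (2^M * S)
    # index of the sub-step; assumption: x_tilde = 1 falls into the last sub-step
    j = min(math.floor(x_tilde * n_sub), n_sub - 1)
    k = j // s  # every S consecutive sub-steps share one symbol
    h = j % s   # position inside the symbol step, stored as helper data
    return k, h


if __name__ == "__main__":
    # 2-bit 2-metric configuration: sub-steps of width 1/8
    readout = [0.10, 0.20, 0.60, 0.95]
    print([enroll(x, s=2, m=2) for x in readout])
```

Applying `enroll` to every element of a transformed readout yields the vector of tuples $bold(kappa)$.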
=== Reconstruction
We already demonstrated the basic principle of the reconstruction phase in section @sect:tmhd, which showed the advantage of using more than one quantizer during reconstruction.
We will call our repeated measurement of $tilde(x)$ that is subject to a certain error $tilde(x^*)$.
To perform reconstruction with $tilde(x^*)$, we will first need to find all $S$ quantizers for which we generated the helper data in the previous step and then choose the one corresponding to the saved metric.
We have to distinguish the two cases, that $S$ is either even or odd:\
If $S$ is even, we need to define $S$ quantizers offset by multiples of $Phi$.
We can define the ideal position for the quantizer bounds based on its corresponding metric as centered around the center of the metric.
We can find these new bounds graphically as depicted in @fig:smhd_find_bound_graph. We first determine the x-values of the centers of a metric (here M1, as shown with the arrows). We can then place the quantizer steps with step size $Delta$ (@eq:delta) evenly spaced around these points.
If the resulting quantizer bound is smaller than $0$ or bigger than $1$, we will either add or subtract $1$ from its value so it stays in the defined range of the tilde domain.
With these new points for the vertical steps of $cal(Q)$, we can draw the new quantizer for the first metric in @fig:smhd_found_bound_graph.
#grid(
columns: (1fr, 0.1fr, 1fr),
[#scale(x: 70%, y: 70%)[
#figure(
include("../graphics/quantizers/s-metric/2_2_find_quantizer.typ"),
caption: [Ideal centers and bounds for the M1 quantizer]
)<fig:smhd_find_bound_graph>]],
[#align(center)[#align(horizon)[#text(25pt)[$arrow.r.double$]]]],
[#scale(x: 70%, y: 70%)[
#figure(
include("../graphics/quantizers/s-metric/2_2_found_quantizer1.typ"),
caption: [Quantizer for the first metric]
)<fig:smhd_found_bound_graph>]]
)
As for metric 2, we can apply the same strategy and find the points for the vertical steps to be at $1/16, 5/16, 9/16$ and $13/16$. This quantizer is shown together with the first-metric quantizer in @fig:smhd_2_2_reconstruction, forming the complete quantizer for the reconstruction phase of a 2-bit 2-metric configuration $cal(R)(2,2,tilde(x))$.
#grid(
columns: (1fr, 1fr),
[
#scale(x: 80%, y: 80%)[
#figure(
include("../graphics/quantizers/s-metric/2_2_reconstruction.typ"),
caption: [2-bit 2-metric reconstruction quantizer]
)<fig:smhd_2_2_reconstruction> ]
],
[
#scale(x: 80%, y: 80%)[
#figure(
include("../graphics/quantizers/s-metric/3_2_reconstruction.typ"),
caption: [2-bit 3-metric reconstruction quantizer],
)<fig:smhd_3_2_reconstruction> ]
]
)
Analytically, the offset we are applying to $cal(E)(2, 2, tilde(x))$ can be defined as
$ Phi = lr(frac(1, 2^M dot S dot 2)mid(|))_(M=2, S=2) = 1 / 16 space.en. $<eq:offset>
$Phi$ is the constant that we will multiply with a certain metric index $i in [- S/2, ..., S/2]$ to obtain the metric offset $phi$, which is used to define each of the $S$ different quantizers for reconstruction.
//This is also shown in @fig:smhd_2_2_reconstruction, as our quantizer curve is moved $1/16$ to the left and the right.
In @fig:smhd_2_2_reconstruction, the two metric indices $i = plus.minus 1$ will be multiplied with $Phi$, yielding two quantizers, one moved $1/16$ to the left and one moved $1/16$ to the right.
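For the 2-bit 2-metric case, the reconstruction can be sketched as below. Instead of moving the quantizer bounds, the sketch shifts the noisy value by the negative metric offset, which is equivalent. Several points are assumptions for illustration: the name `reconstruct`, the numbering of metrics starting at $0$ (index $0$ is shifted left, index $1$ right), and the cyclic treatment of values that wrap around the edges of the tilde domain.

```python
import math

# metric offsets phi for the 2-bit 2-metric configuration, in metric order
OFFSETS_2_2 = [-1 / 16, 1 / 16]


def reconstruct(x_star: float, h: int, offsets: list[float], m: int) -> int:
    """Quantize the noisy value x_star with the quantizer selected by metric h."""
    delta = 1.0 / 2**m
    # shifting the bounds by phi equals shifting the value by -phi;
    # the modulo keeps the result inside [0, 1) (assumed cyclic wrap-around)
    shifted = (x_star - offsets[h]) % 1.0
    return min(math.floor(shifted / delta), 2**m - 1)


if __name__ == "__main__":
    # A value enrolled at 0.24 has symbol 0 and metric index 1. A noisy readout
    # of 0.26 crosses the enrollment bound at 0.25 but is still mapped to 0.
    print(reconstruct(0.26, h=1, offsets=OFFSETS_2_2, m=2))
```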
If an odd number of metrics is given, the offset can still be calculated using @eq:offset. Additionally, we will keep the original quantizer used during enrollment as the quantizer for metric $(S-1)/2$ (@fig:smhd_3_2_reconstruction).
To find all metric offsets for values of $S > 3$, we can use @alg:find_offsets.
We can calculate $Phi$ based on $S$ and $M$ using @eq:offset. The resulting list of offsets is correctly ordered and can be mapped to the corresponding metrics in ascending order.// as we will show in @fig:4_2_offsets and @fig:6_2_offsets.
#figure(
kind: "algorithm",
supplement: [Algorithm],
include("../pseudocode/offsets.typ")
)<alg:find_offsets>
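The following Python sketch shows one way to compute the ordered list of metric offsets from @eq:offset. It is not a transcription of the pseudocode referenced above; the function name `metric_offsets` and the use of exact fractions are choices made for illustration. For even $S$ the index $0$ is skipped, for odd $S$ it is kept, so that the middle metric reuses the enrollment quantizer.

```python
from fractions import Fraction


def metric_offsets(s: int, m: int) -> list[Fraction]:
    """Return the offsets phi = i * Phi for all S metrics in ascending order."""
    if s < 1 or m < 1:
        raise ValueError("s and m must be positive integers")
    phi_const = Fraction(1, 2**m * s * 2)  # Phi = 1 / (2^M * S * 2)
    half = s // 2
    if s % 2 == 0:
        # even S: indices -S/2, ..., -1, 1, ..., S/2 (no unshifted quantizer)
        indices = [i for i in range(-half, half + 1) if i != 0]
    else:
        # odd S: index 0 keeps the original enrollment quantizer
        indices = list(range(-half, half + 1))
    return [i * phi_const for i in indices]


if __name__ == "__main__":
    for s in (2, 3, 4, 6):
        print(s, [str(phi) for phi in metric_offsets(s, 2)])
```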
==== Offset properties<par:offset_props>
//#inline-note[Diese section ist hier etwas fehl am Platz, ich weiß nur nicht genau wohin damit. Außerdem ist sie ein bisschen durcheinander geschrieben]
Before we go on and experimentally test this realization of the S-Metric method, let's look deeper into the properties of the metric offset value $phi$.
Comparing @fig:smhd_2_2_reconstruction, @fig:smhd_3_2_reconstruction and their respective values of @eq:offset, we can observe that the offset $Phi$ gets smaller the more metrics we use.
#figure(
table(
columns: (11),
inset: 7pt,
align: center + horizon,
[$S$],
[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],
[$Phi$],[$1/8$],table.cell(fill: gray)[$1/16$], [$1/24$], table.cell(fill:gray)[$1/32$], [$1/40$], table.cell(fill:gray)[$1/48$], [$1/56$], table.cell(fill:gray)[$1/64$], [$1/72$], table.cell(fill:gray)[$1/80$]
),
caption: [Offset values for 2-bit configurations]
)<tab:offsets>
As previously stated, we will need to define $S$ quantizers: for even $S$, $S/2$ of them are shifted to the left and $S/2$ to the right.
For example, setting the parameter $S$ to $4$ means we will need to move the enrollment quantizer $2$ times to the left and $2$ times to the right.
As we can see in @fig:4_2_offsets, $phi$ for the maximum metric indices $i = plus.minus 2$ is identical to the offsets of a 2-bit 2-metric configuration.
In fact, this property carries on for higher even numbers of metrics, as shown in @fig:6_2_offsets.
#grid(
columns: (1fr, 1fr),
[#figure(
table(
columns: (5),
inset: 7pt,
align: center + horizon,
[$bold(i)$], [$-2$], [$-1$], [$1$], [$2$],
[*Metric*], [M1], [M2], [M3], [M4],
[$bold(phi)$], [$-frac(1, 16)$], [$-frac(1, 32)$], [$frac(1, 32)$], [$frac(1, 16)$]
),
caption: [2-bit 4-metric offsets]
)<fig:4_2_offsets>
],
[#figure(
table(
columns: (7),
align: center + horizon,
inset: 7pt,
[$bold(i)$], [$-3$], [$-2$], [$-1$], [$1$], [$2$], [$3$],
[*Metric*], [M1], [M2], [M3], [M4], [M5], [M6],
[$bold(phi)$], [$-frac(1, 16)$], [$-frac(1, 24)$], [$-frac(1, 48)$], [$frac(1, 48)$], [$frac(1, 24)$], [$frac(1, 16)$]
),
caption: [2-bit 6-metric offsets]
)<fig:6_2_offsets>
]
)
At $S=6$ metrics, the biggest metric offset we encounter is $phi = 1/16$ at $i = plus.minus 3$.\
This maximum offset is of particular interest to us, as it tells us how far we deviate from the original quantizer used during enrollment.
For a 2-bit configuration with an even number of metrics, the maximum offset is always $phi = 1/16$; a higher even number of metrics only introduces smaller offsets in between.
More formally, we can define the maximum metric offset as follows:
$ phi_"max" = frac(floor(frac(S,2)), 2^M dot S dot 2) $
/*More formally, we can define the maximum metric offset for an even number of metrics as follows:
$ phi_("max,even") = frac(frac(S,2), 2^M dot S dot 2) = frac(1, 2^M dot 4) $<eq:max_offset_even>
Here, we multiply $phi$ from @eq:offset by the maximum metric index $i_"max" = S/2$.
Now, if we want to find the maximum offset for a odd number of metrics, we need to modify @eq:max_offset_even, more specifically its numerator.
For that reason, we will decrease the parameter $m$ by $1$, that way we will still perform a division without remainder:
$
phi_"max,odd" &= frac(frac(S-1, 2), 2^n dot S dot 2)\
&= lr(frac(S-1, 2^M dot S dot 4)mid(|))_(M=2, S=3) = 1/24
$
*/
//It is important to note, that $phi_"max,odd"$, unlike $phi_"max,even"$, is dependent on the parameter $S$ as we can see in @tb:odd_offsets.
It is important to note that $phi_"max"$ depends on the parameter $S$ if $S$ is an odd number.
#figure(
table(
columns: (5),
align: center + horizon,
inset: 7pt,
[*S*],[3],[5],[7],[9],
[$bold(phi_"max,odd")$],[$1/24$],[$1/20$],[$3/56$],[$1/18$]
),
caption: [2-bit maximum offsets, odd]
)<tb:odd_offsets>
The higher $S$ is chosen, the closer we approximate $phi_"max"$ for even choices of $S$, as shown in @eq:offset_limes.
This means, while also keeping the original quantizer during the reconstruction phase, the maximum offset for an odd number of metrics will always be smaller than for an even number.
$
lim_(S arrow.r infinity) phi_"max,odd" &= lim_(S arrow.r infinity) frac(floor(frac(S,2)), 2^M dot S dot 2) = lim_(S arrow.r infinity) frac(S-1, 2^M dot S dot 4) #<eq:offset_limes>\
&= frac(1, 2^M dot 4) = phi_"max,even"
$
Because $phi_"max,odd"$ only approaches $phi_"max,even"$ for $S arrow.r infinity$, we can assume that configurations with an even number of metrics will always perform marginally better than configurations with an odd number of metrics, because the bigger maximum offset allows for better reconstruction capabilities. //#margin-note[Sehr unglücklich mit der formulierung hier]
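The maximum offset formula can be checked numerically with a few lines of Python. The helper name `max_offset` is hypothetical; exact fractions are used so that the values can be compared directly with the tables above.

```python
from fractions import Fraction


def max_offset(s: int, m: int) -> Fraction:
    """Return phi_max = floor(S / 2) / (2^M * S * 2)."""
    return Fraction(s // 2, 2**m * s * 2)


if __name__ == "__main__":
    # 2-bit configurations: even S always give 1/16,
    # odd S stay below 1/16 and approach it as S grows
    for s in (2, 3, 4, 5, 7, 9, 101):
        print(s, max_offset(s, 2))
```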
== Improvements<sect:smhd_improvements>
The S-Metric Helper Data Method proposed by Fischer in @smhd can be improved by using Gray-coded labels for the quantized symbols instead of naive labelling.
#align(center)[
#scale(x: 80%, y: 80%)[
#figure(
include("../graphics/quantizers/two-bit-enroll-gray.typ"),
caption: [Gray Coded 2-bit quantizer]
)<fig:2-bit-gray>]]
@fig:2-bit-gray shows a 2-bit quantizer with gray-coded labelling.
In this example, we have an advantage at $tilde(x) approx 0.5$, because a quantization error only returns one wrong bit instead of two.
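To illustrate the effect, the sketch below compares the number of differing bits between neighbouring symbols for naive and Gray-coded labels. It assumes the binary-reflected Gray code; the exact label assignment used in the figure may differ, and the helper names are hypothetical.

```python
def gray_label(k: int) -> int:
    """Binary-reflected Gray code of the symbol index k."""
    return k ^ (k >> 1)


def bit_errors(a: int, b: int) -> int:
    """Number of differing bits between two labels."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    m = 2
    for k in range(2**m - 1):
        naive = bit_errors(k, k + 1)
        gray = bit_errors(gray_label(k), gray_label(k + 1))
        print(f"bins {k} and {k + 1}: naive {naive} bit(s), Gray {gray} bit(s)")
```

With naive labels, the neighbouring bins around $tilde(x) = 0.5$ carry the labels `01` and `10` and differ in two bits, whereas their Gray-coded labels `01` and `11` differ in only one.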
Furthermore, the transformation into the Tilde-Domain could also be performed using the @ecdf to achieve a more precise uniform distribution because we do not have to estimate a standard deviation of the input values.
//#inline-note[Hier vielleicht noch eine Grafik zur Visualisierung?]
== Experiments<sect:smhd_experiments>
We tested the implementation of @sect:smhd_implementation with the dataset of @dataset.
The dataset contains counts of positive edges of a ring oscillator at a set evaluation time $D$. Based on the count $k$ and the evaluation time, the frequency of a ring oscillator can be calculated using: $f = 2 dot frac(k, D)$.
Because we want to analyze the performance of the S-Metric method over different temperatures, both during enrollment and reconstruction, we are limited to the experimental measurements of @dataset which varied the temperature during the FPGA operation.
We will have measurements of $50$ FPGA boards available with $1600$ and $1696$ ring oscillators each.
The two measurement sets are obtained from different slices of the FPGA board where the only difference to note is the number of ring oscillators available.
To obtain the values to be processed, we subtract them in pairs, yielding $800$ and $848$ ring oscillator frequency differences _df_.\
Because we can assume that the frequencies _f_ are i.i.d., the difference _df_ can also be assumed to be i.i.d.
To apply the values _df_ to our implementation of the S-Metric method, we will first transform them into the Tilde-Domain using a @cdf, resulting in uniformly distributed values $tilde(x)$.
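The preprocessing described here can be sketched as follows. The sketch makes several assumptions that are not taken from the dataset itself: adjacent oscillators form a pair, the differences are modelled as Gaussian with mean and standard deviation estimated from the data, and the function names as well as the example numbers are purely hypothetical.

```python
from statistics import NormalDist, mean, stdev


def frequency(count: int, d: float) -> float:
    """Ring oscillator frequency from the edge count: f = 2 * k / D."""
    return 2 * count / d


def to_tilde_domain(frequencies: list[float]) -> list[float]:
    """Pairwise frequency differences, mapped into [0, 1] with a Gaussian CDF."""
    # assumption: oscillators 0 and 1, 2 and 3, ... are subtracted in pairs
    df = [frequencies[i] - frequencies[i + 1]
          for i in range(0, len(frequencies) - 1, 2)]
    dist = NormalDist(mean(df), stdev(df))  # parameters estimated from the data
    return [dist.cdf(d) for d in df]


if __name__ == "__main__":
    counts = [1000, 980, 1010, 1030, 990, 995, 1020, 1005]  # made-up edge counts
    freqs = [frequency(k, d=1.0) for k in counts]
    print(to_tilde_domain(freqs))
```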
Our resulting dataset consists of #glspl("ber") for quantization symbol widths of up to $6 "bits"$ evaluated with generated helper-data from up to $100 "metrics"$.
In the following section, we will often set the maximum number of metrics to be $S=100$.
This choice refers to the asymptotic behaviour of the @ber and can be equated with the choice $S arrow infinity$.
//We chose not to perform simulations for bit widths higher than $6 "bits"$, as we will see later that we have already reached a bit error rate of approx. $10%$ for these configurations.
#pagebreak()
=== Results & Discussion
The bit error rate of different S-Metric configurations for naive labelling can be seen in @fig:global_errorrates.
For this analysis, enrollment and reconstruction were both performed at room temperature. //and the quantizer was naively labelled.
#figure(
image("../graphics/25_25_all_error_rates_fixed.svg", width: 90%),
caption: [Bit error rates for same-temperature execution. Here we can already observe the asymptotic #glspl("ber") for higher metric numbers. The error rate is scaled logarithmically here.]
)<fig:global_errorrates>
We can observe two key properties of the S-Metric method in @fig:global_errorrates.
//The exponential growth of the error rate of classic 1-metric configurations can be observed through the increase of the error rates.
The exponential growth of the @ber can be observed if we set $S=1$ and increase $M$ up to $6$.
Also, as we expanded on in @par:offset_props, at some point using more metrics will no longer improve the bit error rate of the key.
At a symbol width of $M >= 6$ bits, no further improvement through the S-Metric method can be observed.
#figure(
include("../graphics/plots/errorrates_changerate.typ"),
caption: [Asymptotic performance of @smhdt]
)<fig:errorrates_changerate>
This tendency can also be shown through @fig:errorrates_changerate.
Here, we calculated the quotient of the bit error rate using one metric and 100 metrics.
From $M >= 6$ onwards, $(op("BER")(1, 2^M)) / (op("BER")(100, 2^M))$ approaches $1$, which means that no real improvement is possible anymore through the S-Metric method.
==== Impact of helper data size
The amount of helper data bits required by @smhdt is defined as a function of the number of metrics as $log_2(S)$.
The overall extracted-bits to helper-data-bits ratio can be defined here as $cal(r) = frac(M, log_2(S))$.
#figure(
table(
columns: (7),
inset: 7pt,
align: center + horizon,
[$bold(M)$], [$1$], [$2$], [$3$], [$4$], [$5$], [$6$],
[$bold(S)$], [$2$], [$4$], [$8$], [$16$], [$32$], [$64$],
[*@ber*], [$0.012$], [$0.9 dot 10^(-4)$], [$0.002$], [$0.025$], [$0.857$], [$0.148$],
),
caption: [S-Metric performance with same bit-to-metric ratios]
)<fig:smhd_ratio_performance>
If we take a look at the error rates of configurations for which $cal(r) = 1$, we can observe a decline in performance of @smhdt for higher-bit quantization processes in general.
This behaviour is also shown in @fig:smhd_ratio_performance.
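As a small worked example, the ratio $cal(r)$ can be evaluated for the configurations listed in the table. The function name `bits_per_helper_bit` is hypothetical.

```python
import math


def bits_per_helper_bit(m: int, s: int) -> float:
    """Ratio r = M / log2(S) of extracted bits to helper data bits."""
    if s < 2:
        raise ValueError("at least two metrics are needed to store helper data")
    return m / math.log2(s)


if __name__ == "__main__":
    # the configurations of the table all use S = 2^M metrics
    for m in range(1, 7):
        print(m, 2**m, bits_per_helper_bit(m, 2**m))
```

For every configuration with $S = 2^M$, the ratio evaluates to $1$, so each extracted bit is paid for with exactly one helper data bit.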
==== Impact of temperature<sect:impact_of_temperature>
We will now take a look at the impact on the error rates of changing the temperature both during the enrollment and the reconstruction phase.
The most common case is a fixed temperature during enrollment, most likely $25°C$.
Since we will not always be able to recreate lab-like conditions during the reconstruction phase, it makes sense to look at the error rates for reconstruction performed at different temperatures.
#figure(
include("../graphics/plots/temperature/25_5_re.typ"),
caption: [#glspl("ber") for reconstruction at different temperatures. Generally, the further we move away from the enrollment temperature, the worse the #gls("ber") gets. ]
)<fig:smhd_tmp_reconstruction>
@fig:smhd_tmp_reconstruction shows the results of this experiment conducted with a 2-bit configuration.\
As we can see, the further we move away from the temperature of enrollment, the higher the #glspl("ber").
We can observe this property in more detail in @fig:global_diffs.
#scale(x: 90%, y: 90%)[
#figure(
include("../graphics/plots/temperature/global_diffs/global_diffs.typ"),
caption: [#glspl("ber") for different enrollment and reconstruction temperatures. The lower number in the operating configuration is assigned to the enrollment phase, the upper one to the reconstruction phase. The correlation between the #gls("ber") and the temperature is clearly visible here]
)<fig:global_diffs>]
Here, we compared the asymptotic performance of @smhdt for different temperatures both during enrollment and reconstruction. First we can observe that the optimum temperature for the operation of @smhdt in both phases for the dataset @dataset is $35°C$ instead of the expected $25°C$.
Furthermore, the @ber seems to be almost directly determined by the absolute temperature difference, especially at higher temperature differences, showing that the further apart the temperatures of the two phases are, the higher the @ber.
==== Gray coding
In @sect:smhd_improvements, we discussed how a gray coded labelling for the quantizer could improve the bit error rates of the S-Metric method.
Because we only change the labelling of the quantizing bins and do not make any changes to #gls("smhdt") itself, we can assume that the effects of temperature on the quantization process are directly translated to the gray-coded case.
@fig:smhd_gray_coding shows the comparison of applying #gls("smhdt") at room temperature for both naive and gray-coded labels.
There we can already observe the improvement of using gray-coded labelling, but the impact of this change of labels can really be seen in @tab:gray_coded_impact.
As we can see, the improvement rises rapidly to a peak at a bit width of $M=3$ and then falls again slightly.
This effect can be explained by the exponential rise of the #gls("ber") for higher bit widths $M$.
For $M>3$, the rise of the #gls("ber") outweighs the possible improvement gained by applying a gray-coded labelling.
#figure(
table(
columns: (7),
align: center + horizon,
inset: 7pt,
[$bold(M)$],[1],[2],[3],[4], [5], [6],
[*Improvement*],[$0%$], [$24.75%$], [$47.45%$], [$46.97%$], [$45.91%$], [$37.73%$]
),
caption: [Improvement of using gray-coded instead of naive labelling, per bit width]
)<tab:gray_coded_impact>
#figure(
image("./../graphics/plots/gray_coding/3dplot.svg"),
caption: [Comparison between #glspl("ber") using naive labelling and gray-coded labelling]
)<fig:smhd_gray_coding>
Using the dataset, we can estimate the average improvement for using gray-coded labelling to be at $33%$.

@ -2,96 +2,4 @@
#import "@preview/bob-draw:0.1.0": *
= Introduction
In the field of cryptography, @puf devices are a popular tool for key generation and storage @PUFIntro @PUFIntro2.
In general, a @puf refers to a type of circuit that exhibits slightly different behaviors during operation due to minor variations in the manufacturing process.
Since the behaviour of one @puf device is now only reproducible on itself and not on a device of the same type with the same manufacturing process, it can be used for secure key generation and/or storage.\
To improve the reliability of the keys generated and stored using the @puf, various #glspl("hda") have been introduced.
The general operation of a @puf with a @hda can be divided into two separate stages: _enrollment_ and _reconstruction_ as shown in @fig:puf_operation @PUFChartRef.
#figure(
include("../charts/PUF.typ"),
caption: [@puf model description using enrollment and reconstruction @PUFChartRef]
)<fig:puf_operation>
The enrollment stage will usually be performed in near ideal, lab-like conditions i.e. at room temperature ($25°C$).
During this phase, a first @puf readout $nu$ with corresponding helper data $h$ is generated.
Going on, reconstruction can now be performed under varying conditions, for example at a higher temperature.
Here, a slightly different @puf readout $nu^*$ is generated.
Using the helper data $h$, the new @puf readout $nu^*$ can be corrected so that it deviates less from $nu$ than before.
One possible implementation of this principle is called _Fuzzy Commitment_ @fuzzycommitmentpaper @ruchti2021decoder.
Previous works already introduced different #glspl("hda") with various strategies @delvaux2014helper @maes2009soft.
The simplest form of helper-data one could generate is reliability information for every @puf bit.
Here, the @hda marks unreliable @puf bits that are then either discarded during reconstruction or rather corrected using an error correction code after the quantization process.
Going on, publications @tmhd1 and @tmhd2 introduced a metric-based @hda as @tmhdt.
The main goal of such a @hda is to improve the reliability of the @puf during the quantization step of the enrollment phase.
To achieve that, helper data is generated to define multiple quantizers for the reconstruction phase to minimize the risk of bit errors.
A generalization outline to extend @tmhdt for higher order bit quantization has already been proposed by Fischer in @smhd.
In the course of this work, we will first take a closer look at @smhdt as proposed by Fischer @smhd and provide a concrete realization for this method.
We will also propose the idea of a method to shape the input values of a @puf to better fit within the bounds of a multi-bit quantizer which we call @bach and discuss how such a @hda can be successfully implemented in the future.
== Notation
To ensure a consistent notation of functions and ideas, we will now introduce some conventions and definitions.
Random variables will be denoted with a capital letter, e.g. $X$.
Realizations will be the corresponding lower case letter, $x$.
Values of $x$ subject to some kind of error are marked with a $*$ in the exponent e.g., $x^*$.
Vectors will be written in bold text: e.g., $bold(k)$ represents a vector of quantized symbols.
Matrices are denoted with a bold capital letter: $bold(M)$.
We will call a quantized symbol $k$. $k$ can take the value of any binary symbol, e.g. $0, 01, 110$.
A quantizer will be defined as a function $cal(Q)(x, bold(a))$ that returns a quantized symbol $k$.
We also define the following special quantizers for metric based #glspl("hda"):
A quantizer used during the enrollment phase is defined by a calligraphic $cal(E)$.
For the reconstruction phase, a quantizer will be defined by a calligraphic $cal(R)$.
@example-quantizer shows the curve of a 2-bit quantizer that receives $tilde(x)$ as input. If the value of $tilde(x)$ equals one of the four bounds, the quantized value is chosen randomly from the relevant bins.
#figure(
include("../graphics/quantizers/two-bit-enroll.typ"),
caption: [Example quantizer function]) <example-quantizer>
For the S-Metric Helper Data Method, we introduce a function
$ cal(Q)(S,M) , $<eq-1>
where $S$ determines the number of metrics and $M$ the bit width of the symbols.
The corresponding metric is defined through the lower case $s$, the bit symbol through the lower case $m$.
To compare both @smhdt and @bach, we will use a ratio $cal(r) = frac("Extracted bits", "Helper data bits")$.
This ratio gives us an idea how many helper data bits were used to obtain a quantized symbol.
$cal(r)$ is smaller than $1$ if the number of helper data bits per quantized symbol is bigger than the symbol bit width itself, and at least $1$ otherwise.
=== Tilde Domain<tilde-domain>
The tilde domain describes the range of numbers between $0$ and $1$, which is defined by the image of a @cdf.
As also described in @smhd, we will use a @cdf to transform the real PUF values into the tilde domain.
This transformation can be performed using the function $tilde(x) = xi(x)$. The key property of this transformation is the resulting uniform distribution of $tilde(x)$.
Considering a normal distribution, the CDF is defined as
$ xi(frac(x - mu, sigma)) = frac(1, 2)[1 + op("erf")(frac(x - mu, sigma sqrt(2)))]. $
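A direct Python transcription of this formula could look as follows. The function name `normal_cdf` and its default parameters are choices made for illustration.

```python
import math


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """CDF of a normal distribution, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


if __name__ == "__main__":
    # values of a zero-mean, unit-variance readout mapped into the tilde domain
    print([round(normal_cdf(x), 4) for x in (-1.0, 0.0, 1.0)])
```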
==== #gls("ecdf", display: "Empirical cumulative distribution function (eCDF)")
We will not always be able to find an analytical description of a probability distribution and its corresponding @cdf.
Alternatively, an @ecdf can be constructed through sorting the empirical measurements of a distribution @dekking2005modern.
Although less accurate, this method allows a more simple and less computationally complex way to transform real valued measurements into the tilde domain.
We will mainly use the @ecdf in @chap:smhd because of the difficulty of finding an analytical description for the @cdf of a weighted linear combination of random variables.
The function for an @ecdf can be defined as
$
xi_#gls("ecdf") (x) = frac("number of elements in " bold(z)", s.t" <= x, n) in [0, 1],
$<eq:ecdf_def>
where $n$ defines the number of elements in the vector $bold(z)$.
If the vector $bold(z)$ were to contain the elements $[1, 3, 4, 5, 7, 9, 10]$ and $x = 5$, @eq:ecdf_def would yield $xi_#gls("ecdf") (5) = frac(4, 7)$.\
The application of @eq:ecdf_def on $X$ will transform its values into the empirical tilde domain.
We can also define an inverse @ecdf:
$
xi_#gls("ecdf")^(-1) (tilde(x)) = tilde(x) dot n
$<eq:ecdf_inverse>
The result of @eq:ecdf_inverse is the index $i$ of the element $z_i$ from the vector of realizations $bold(z)$.
To apply the @ecdf to our numerical results later, we will sort the vector of realizations $bold(z)$ of a random variable $Z$ in ascending order.
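Both directions can be sketched in Python using a sorted list and a binary search. The function names are hypothetical, exact fractions are used to avoid rounding issues, and the index returned by the inverse is interpreted as 1-based, which is an assumption about the indexing convention.

```python
from bisect import bisect_right
from fractions import Fraction


def ecdf(z_sorted: list[float], x: float) -> Fraction:
    """Fraction of elements in the ascending list z_sorted that are <= x."""
    return Fraction(bisect_right(z_sorted, x), len(z_sorted))


def ecdf_inverse_index(x_tilde: Fraction, n: int) -> int:
    """Index i = x_tilde * n of the element z_i (assumed 1-based)."""
    return int(x_tilde * n)


if __name__ == "__main__":
    z = sorted([7, 1, 10, 4, 3, 9, 5])
    x_tilde = ecdf(z, 5)
    i = ecdf_inverse_index(x_tilde, len(z))
    print(x_tilde, i, z[i - 1])
```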
#lorem(500)

@ -1,26 +0,0 @@
= Conclusion and Outlook
During the course of this work, we took a closer look at an already introduced @hda, @smhdt, and provided a concrete realization.
Our experiments showed that after a certain point, using more metrics $S$ will not improve the @ber any further, as the error rates behave asymptotically for $S arrow infinity$.
Furthermore, we concluded that for higher choices of the symbol width $M$, @smhdt will not be able to improve on the @ber, as the initial error is too high.
An interesting addition to our analysis was Gray-coded labelling for the quantizer, which resulted in an improvement of $approx 30%$.
Going on, we introduced the idea of a new @hda which we called Boundary Adaptive Clustering with Helper data @bach.
Here we aimed to utilize the idea of moving our initial @puf measurement values away from the quantizer bound to reduce the @ber using weighted linear combinations of our input values.
Although this method showed promising results for a sign-based quantization, yielding an improvement of $approx 96%$ in our testing, finding a good approach to generalize this concept turned out to be difficult.
The first issue was the lack of an analytical description of the probability distribution resulting from the linear combinations.
We accounted for that by using an algorithm that alternates between defining the quantizing bounds using an @ecdf and optimizing the weights for the linear combinations based on the found bounds.
The initial loose definition to find ideal linear combinations which maximize the distance to their nearest quantization bounds did not result in a stable probability distribution over various iterations.
Thus, we proposed a different approach to approximate the linear combination to the centers between the quantizing bounds.
This method resulted in a stable probability distribution, but did not provide any meaningful improvements to the @ber in comparison to not using any helper data at all.
Future investigations of the @bach idea might find a solution to the convergence of the bound distance maximization strategy.
Since the vector of bounds $bold(b)$ is updated every iteration of @bach, a limit to the deviation from the previous position of a bound might be set.
Furthermore, a recursive approach to reach higher order bit quantization inputs might also result in a converging distribution.
If we do not want to give up the approach using a vector of optimal points $bold(cal(o))$ as in the center point approximation, a way may be found to increase the distance between all optimal points $bold(cal(o))$ to achieve a better separation for the results of the linear combinations in every quantizer bin.
If a converging realization of @bach is found, using fractional weights instead of $plus.minus 1$ could provide more flexibility for the outcome of the linear combinations.
Ultimately, we can build on this in the future and provide a complete key storage system using @bach or @smhdt to improve the quantization process.
But in the end, the real quantizers were the friends we made along the way.