Documentation
¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func NewSampler ¶
func NewSampler[T any](opts ...SamplerOption[T]) *sampler[T]
NewSampler creates a new Sampler instance with the provided configuration options. If no hash function is provided, it defaults to ZeroHash (always returns 0, false). If no max hash function is provided, it defaults to 100% sampling rate.
Example usage:
sampler := NewSampler[string](
WithHashFunc(hashing.DefaultStringHasher),
WithStaticRate(0.1), // 10% sampling rate
WithSkipInvalid(),
)
Types ¶
type Meta ¶
type Meta struct {
// IsHashValid indicates whether the hash function successfully generated
// a valid hash for the input value. False for empty/zero values.
IsHashValid bool
// Bucket is a number from 1 to 100 which represents even "chunks" of the uint32 keyspace.
// Bucket helps to make sure that the "hash" generated by Sample is evenly distributed.
// Ideally, if a value used for sampling is evenly distributed (or random),
// Bucket should be evenly distributed too.
// "bucket" with number 0 is reserved and should not be used.
Bucket int32
// Hash is the uint32 hash value generated for the input.
// This value is compared against MaxHash to determine sampling decision.
Hash uint32
// MaxHash is the threshold value computed from the sampling rate.
// Values with hash <= MaxHash are included in the sample.
MaxHash uint32
}
Meta contains metadata about a sampling operation, providing insights into the internal calculations and decisions made during sampling.
type Sampler ¶
type Sampler[T any] interface { // Sample returns "true" for objects that "survived" sampling // and should be taken and "false" for object that should be skipped. Sample(T) bool // SampleWithMeta returns "true" (take) or "false" (skip), just like the call to Sample. // Additionally, SampleWithMeta return Meta struct with the internals of the sampling call. // Meta can be used for metrics, logging, debugging, etc. SampleWithMeta(T) (bool, Meta) }
Sampler is a hash-based in-memory sampling.
When a message is "sampled" - the algorithm decides if the message should be "taken" (true) or "skipped" (false).
Sampler does not have any dependencies and does not require a third-party storage. All operations are performed locally.
On the high level, Sample does the following: * it uses uint32 number as a "full" keyspace: [0, max(uint32)]. * clients set "sampling rate", i.e. "0.05" is 5%, "0.5" is 50%. * "sampling rate" must be between 0.0 and 1.0 * "sampling rate" is converted to a "max-hash" number, e.g. "0.05" becomes "214,748,364" * Sample generates a "hash" (uint32 number) for each message being sampled * after, the hash is compared with the "max-hash" * if "hash" is less than "max-hash" - the message added to the result, if not - skipped.
Here's an illustration of how it all works:
If Sampling rate is 5% (0.05) Then, the "max hash" is: 214,748,364
0 214,748,364 4,294,967,295 |-----------------|------------- ... ---------------------------| | 1 | 2 | 3 | 4 | 5 | ... | 99 | 100 | <- bucket
///////////////// \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
^ ^
| |
if the hash (number) if the hash (number)
is in this area, is in this area,
call to Sample() will return call to Sample() will return
"true" (should take) "false" (should skip)
type SamplerOption ¶
type SamplerOption[T any] func(c *sampler[T])
SamplerOption is a functional option type for configuring a Sampler instance. It follows the functional options pattern to provide flexible configuration.
func WithHashFunc ¶
func WithHashFunc[T any](hasher hashing.HashFunc[T]) SamplerOption[T]
WithHashFunc configures the sampler to use a custom hash function. The hash function determines how input values are converted to uint32 hashes. See: hashing.HashFunc
func WithMaxHashFunc ¶
func WithMaxHashFunc[T any](maxHash hashing.MaxHashFunc[T]) SamplerOption[T]
WithMaxHashFunc configures the sampler to use a custom max hash function. The max hash function determines the sampling threshold for each input. See: hashing.MaxHashFunc
func WithSkipInvalid ¶
func WithSkipInvalid[T any]() SamplerOption[T]
WithSkipInvalid configures the sampler to skip inputs that cannot be hashed (e.g., empty strings, nil values). By default, invalid inputs are included, e.g. a call to Sample() returns "true".
func WithStaticRate ¶
func WithStaticRate[T any](rate float64) SamplerOption[T]
WithStaticRate configures the sampler with a fixed sampling rate. The rate should be between 0.0 (0%) and 1.0 (100%). This creates a static max hash function that applies the same rate to all inputs.