Understanding Moments
Why are a distribution's moments called "moments"? How does the equation for a moment capture the shape of a distribution? Why do we typically study only four moments? A useful blog post discussing these questions is https://gregorygundersen.com/blog/2020/04/11/moments/.
In short, moments describe how the probability mass of a random variable is distributed.
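Concretely, for a random variable X with mean μ and standard deviation σ, the raw, central, and standardized k-th moments are conventionally defined as follows (standard definitions, not spelled out in the original note):

```latex
\mu'_k = \mathbb{E}\left[X^k\right], \qquad
\mu_k = \mathbb{E}\left[(X - \mu)^k\right], \qquad
\tilde{\mu}_k = \mathbb{E}\left[\left(\frac{X - \mu}{\sigma}\right)^{\!k}\,\right]
```

Under these definitions, the mean is the first raw moment, the variance is the second central moment, and skewness and kurtosis are the third and fourth standardized moments.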
The zeroth moment, the total mass, captures the fact that every distribution's probability mass sums (or integrates) to one.
The first moment, the mean, specifies the distribution’s location, shifting the center of mass left or right.
The second central moment, the variance, specifies the scale or spread; loosely speaking, flatter, more spread-out distributions are "more random".
The third standardized moment, skewness, quantifies the relative size of a distribution's two tails; the sign indicates which tail is heavier, and the magnitude indicates by how much.
The fourth standardized moment, kurtosis, captures the absolute size of the two tails.
Higher standardized moments largely recapitulate the information already contained in skewness and kurtosis, so by convention we stop at the third and fourth.
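The four quantities above can be estimated directly from data. The following sketch (my own illustration, not from the original note) draws samples from an exponential distribution, which is strongly right-skewed, and computes the sample mean, variance, and the third and fourth standardized moments; for Exp(1) the theoretical values are mean 1, variance 1, skewness 2, and kurtosis 9.

```python
import math
import random

random.seed(0)
# Exp(1) is right-skewed, so skewness should come out clearly positive.
xs = [random.expovariate(1.0) for _ in range(100_000)]

n = len(xs)
mean = sum(xs) / n                                   # first moment: location
var = sum((x - mean) ** 2 for x in xs) / n           # second central moment: spread
sd = math.sqrt(var)
skew = sum(((x - mean) / sd) ** 3 for x in xs) / n   # third standardized moment
kurt = sum(((x - mean) / sd) ** 4 for x in xs) / n   # fourth standardized moment

print(f"mean={mean:.3f} var={var:.3f} skew={skew:.3f} kurt={kurt:.3f}")
```

With 100,000 samples the estimates land close to the theoretical values, though the kurtosis estimate converges slowly because it depends on the sample's extreme tails.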
Finally, moments are important theoretically because, under suitable conditions, they provide an alternative way to fully and uniquely specify a probability distribution, a fact that is intuitive once you understand how moments quantify a distribution's location, spread, and shape.
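One standard way to make this claim precise (again, a standard fact rather than something from the note itself) is the moment generating function, whose Taylor coefficients are exactly the raw moments; when it exists in a neighborhood of zero, it determines the distribution uniquely:

```latex
M_X(t) = \mathbb{E}\left[e^{tX}\right]
       = \sum_{k=0}^{\infty} \frac{t^k}{k!}\,\mathbb{E}\left[X^k\right]
```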
Reference:
https://gregorygundersen.com/blog/2020/04/11/moments/