This is from the Autoregressive Models chapter:
To see why, let us consider the conditional for the last dimension, given by $p(x_n|x_{\lt n})$. In order to fully specify this conditional, we need to specify a probability for $2^{n-1}$ configurations of the variables $x_1,x_2,\ldots,x_{n-1}$. Since the probabilities should sum to $1$, the total number of parameters for specifying this conditional is given by $2^{n-1}-1$. Hence, a tabular representation for the conditionals is impractical for learning the joint distribution factorized via chain rule.
Shouldn't it be $2^{n-1}$ instead of $2^{n-1}-1$ here? Why minus $1$? As I understand it, the $n$-th random variable depends on the other $n-1$ random variables, so in the binary case the table should have $2^{n-1}$ rows. In each row, the two entries must add up to $1$, so only one of them needs to be specified. That gives one parameter per row, for a total of $2^{n-1}$.
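
To make my counting concrete, here is a minimal sketch of how I am tallying the parameters. It assumes all variables are binary, and the function name `count_conditional_params` is just a hypothetical helper for illustration, not something from the chapter.

```python
from itertools import product


def count_conditional_params(n: int) -> int:
    """Count free parameters in a table for p(x_n | x_1, ..., x_{n-1}), binary case."""
    # One row per configuration of the n-1 conditioning variables.
    rows = list(product([0, 1], repeat=n - 1))
    # Each row stores p(x_n = 0 | row) and p(x_n = 1 | row), which sum to 1,
    # so only one of the two entries is a free parameter.
    free_per_row = 1
    return len(rows) * free_per_row


for n in range(1, 6):
    # Print n, the enumerated count, and the closed form 2^(n-1) side by side.
    print(n, count_conditional_params(n), 2 ** (n - 1))
```

If I sum these counts over all $n$ conditionals in the chain rule, I get $1 + 2 + \cdots + 2^{n-1} = 2^n - 1$, which is the number of free parameters I would expect for a full joint table over $n$ binary variables.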