Bayesian networks are sort of a generalization of the Hidden Markov model. Instead of the really rigid state-evidence structure we had there, in Bayesian networks we will talk about arbitrary relationships among variables, the independences among them, and what we can do in order to perform inference.
Note that today we are still trying to infer and reason about knowledge that we cannot observe directly; we are only given other, related information.
The expression is massively simplified because when we compute the joint probability over all the variables in the network, we actually just multiply the CPT entries of each random variable given its parents: P(X1, ..., Xn) = P(X1 | parents(X1)) * ... * P(Xn | parents(Xn)). (We mean only direct parents here; do not worry about grandparents.)
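As a minimal sketch of this factorization (the two-node network Rain -> WetGrass and all CPT numbers below are invented for illustration, not from the lecture), the joint probability is just the product of each variable's CPT entry given its direct parent:

```python
# P(Rain) -- a root node, so its CPT has no parents.
p_rain = {True: 0.2, False: 0.8}

# P(WetGrass | Rain) -- one row per value of the direct parent.
p_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}

def joint(rain: bool, wet: bool) -> float:
    """P(Rain=rain, WetGrass=wet) = P(rain) * P(wet | rain)."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

print(joint(True, True))  # 0.2 * 0.9 = 0.18
# Sanity check: the four joint entries sum to 1.
print(sum(joint(r, w) for r in (True, False) for w in (True, False)))
```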
Even if X1 and X2 are independent, we can still build a Bayes net like the one in the red box. It's correct but unnecessary, because when we write down the CPT for this net, we can actually extract the information that P(X2|X1) = P(X2), which means X1 and X2 are independent.
So if we know from the very beginning that X1 and X2 are independent, then it's unnecessary to build a Bayes net like the one in the red box: we would be storing a larger table that way. (With binary variables, the net with the edge stores P(X1) plus a two-row table for P(X2|X1), i.e. 3 parameters, while the edgeless net stores just P(X1) and P(X2), i.e. 2.) Remember, the smaller the table, the better.
And given a true Bayes net, the absence of a path between two RVs guarantees independence, but the presence of a path does not guarantee there must be a dependence between them. Just like the case in the red box: we have a path from X1 to X2, yet after calculation we know that they are actually independent, so we would say its CPT contains redundant information.
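Here is a hedged sketch of what "extracting" that independence from the CPT looks like (the numbers are invented; only the X1 -> X2 structure comes from the red-box example): if every row of P(X2|X1) is identical, then P(X2|X1) = P(X2) and the edge carries no information.

```python
# Toy CPTs for the red-box net X1 -> X2; all numbers are made up.
# Every row of P(X2 | X1) is identical, so X2 ignores X1.
p_x1 = {True: 0.3, False: 0.7}
p_x2_given_x1 = {
    True:  {True: 0.6, False: 0.4},
    False: {True: 0.6, False: 0.4},
}

# Marginal P(X2), obtained by summing X1 out of the joint.
p_x2 = {
    v: sum(p_x1[u] * p_x2_given_x1[u][v] for u in (True, False))
    for v in (True, False)
}

# P(X2 | X1) equals P(X2) for every value of X1: the edge is redundant.
for u in (True, False):
    assert all(abs(p_x2_given_x1[u][v] - p_x2[v]) < 1e-9
               for v in (True, False))
print(p_x2)  # {True: 0.6, False: 0.4}
```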
As we said, Bayes net representations are not unique, but we always want a simple, logical one. We are not going to learn how to build a good Bayes net here.