I got stuck at this paragraph:
A causal model goes beyond the graph by including specific probability functions P(X_i | pa_i) for how to calculate the probability of each node X_i taking on the value x_i given the values pa_i of x_i's immediate ancestors. It is implicitly assumed that the causal model factorizes, so that the probability of any value assignment x to the whole graph can be calculated using the product:
P(x) = ∏_i P(x_i | pa_i)
Then the counterfactual conditional P(x|do(Xj=xj)) is calculated via:
P(x | do(X_j = x_j)) = ∏_{i≠j} P(x_i | pa_i)
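For concreteness, here is a minimal sketch of how the two quoted formulas might be read in code. Everything in it is an assumption made for illustration: the three-node binary chain X0 -> X1 -> X2, the probability numbers, and the names `parents`, `cpt`, `joint` and `joint_do`. It also assumes the usual convention, which the quoted formula leaves implicit, that an assignment disagreeing with the forced value of X_j gets probability 0.

```python
from itertools import product

# Hypothetical three-node chain X0 -> X1 -> X2, all nodes binary.
# parents[i] lists the immediate ancestors of node i.
parents = {0: (), 1: (0,), 2: (1,)}

# cpt[i][pa_i][x_i] = P(X_i = x_i | pa_i). The numbers are made up.
cpt = {
    0: {(): {0: 0.7, 1: 0.3}},
    1: {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},
    2: {(0,): {0: 0.6, 1: 0.4}, (1,): {0: 0.1, 1: 0.9}},
}


def joint(x):
    """P(x): the product over all nodes i of P(x_i | pa_i)."""
    p = 1.0
    for i, xi in enumerate(x):
        pa = tuple(x[k] for k in parents[i])
        p *= cpt[i][pa][xi]
    return p


def joint_do(x, j, xj):
    """P(x | do(X_j = x_j)): the same product with node j's factor dropped.

    Assumed convention: assignments where x[j] differs from the forced
    value xj get probability 0.
    """
    if x[j] != xj:
        return 0.0
    p = 1.0
    for i, xi in enumerate(x):
        if i == j:
            continue  # node j is set from outside, so its factor is skipped
        pa = tuple(x[k] for k in parents[i])
        p *= cpt[i][pa][xi]
    return p


assignments = list(product((0, 1), repeat=3))
total = sum(joint(x) for x in assignments)
total_do = sum(joint_do(x, 1, 1) for x in assignments)
```

Both `total` and `total_do` should sum to 1, so each formula defines a proper distribution over assignments. In this toy chain, summing `joint_do` over the assignments with X0 = 1 gives 0.3, the same as the prior for X0: forcing X1 from outside leaves its parent's distribution untouched, which is the difference from ordinary conditioning on X1 = 1.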
First of all, it seems to me that "... X_i taking on the value x_i given the values pa_i of x_i's immediate ancestors" should read "... X_i's immediate ancestors" (capital X). Otherwise I didn't understand this part.
Further down, I don't know what "do(X_j = x_j)" means, and I'm unable to figure it out from context. So this is where I stopped reading.
In fairness, I'm not actually a computer scientist, but this is the closest description of me among the advanced courses of this topic.