#1. Non-replicated tables, internal_replication=false.
Data inserted into the Distributed table is written to both local tables. As long as no insert fails, the data in the two local tables stays in sync. We call this “poor man’s replication” because the replicas easily diverge when there are network problems, and there is no easy way to determine which replica is the correct one.
#2. Replicated tables, internal_replication=true.
Data inserted into the Distributed table is written to only one of the local tables, and the replication mechanism then transfers it to the table on the other host. Thus the data in both local tables stays in sync. This is the recommended configuration.
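As a rough sketch of this recommended setup, the cluster definition in the server configuration could look like the fragment below. The cluster name `example_cluster` and the hosts `host1` and `host2` are placeholders, and port 9000 is assumed to be the native TCP port. The fragment goes inside the root element of the server configuration, and the local tables on both hosts are assumed to use an engine from the Replicated family (for example ReplicatedMergeTree).

```xml
<remote_servers>
    <!-- "example_cluster" is a placeholder cluster name -->
    <example_cluster>
        <shard>
            <!-- true: the Distributed table writes to only one replica
                 and leaves copying the data to the replicated tables -->
            <internal_replication>true</internal_replication>
            <replica>
                <host>host1</host> <!-- placeholder host -->
                <port>9000</port>
            </replica>
            <replica>
                <host>host2</host> <!-- placeholder host -->
                <port>9000</port>
            </replica>
        </shard>
    </example_cluster>
</remote_servers>
```

Setting `internal_replication` to `false` in the same fragment makes the Distributed table write to every replica itself, which is the behavior described in case #1.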
#3. Non-replicated tables, internal_replication=true.
Data is inserted into only one of the local tables, and there is no mechanism to transfer it to the other table. The local tables on different hosts therefore end up with different data, and you get confusing results: the same query can return different data depending on which replica serves it.