drbd: if the replication link breaks during handshake, keep retrying

The 8.3.12 commit "drbd: Bugfix for the connection behavior" fixed a
"wasted established connection" when an earlier connection attempt had
failed during its early stages.

However, it opened a window for a regression if a connection attempt
fails during its last stages.  The result was a terminated receiver
thread that left behind the supposedly transient "C_UNCONNECTED" state.
Any later request to change the connection state then fails, as it waits
for the connection state to "stabilize".

Fix: short circuit and keep retrying to establish a new connection
if we don't reach C_WF_REPORT_PARAMS.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg 2012-11-05 11:54:30 +01:00 committed by Philipp Reisner
parent 063eacf88c
commit ed635cb067

@@ -1051,7 +1051,7 @@ randomize:
 	rcu_read_unlock();
 	rv = conn_request_state(tconn, NS(conn, C_WF_REPORT_PARAMS), CS_VERBOSE);
-	if (rv < SS_SUCCESS) {
+	if (rv < SS_SUCCESS || tconn->cstate != C_WF_REPORT_PARAMS) {
 		clear_bit(STATE_SENT, &tconn->flags);
 		return 0;
 	}
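
The changed condition can be illustrated with a small stand-alone model.
This is only a sketch, not DRBD code: the struct, the enum values, and the
helpers request_state(), try_connect() and connect_with_retry() are
hypothetical stand-ins, and it assumes that returning 0 makes the caller
retry, as the commit title suggests. It models a state request that
reports success even though the link broke and the state fell back to
C_UNCONNECTED, which is the case the added check is meant to catch.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not the DRBD definitions. */
enum conn_state { C_UNCONNECTED, C_WF_CONNECTION, C_WF_REPORT_PARAMS };
enum { SS_SUCCESS = 1 };

struct conn {
	enum conn_state cstate;
	int link_breaks_left;	/* how many attempts will lose the link */
};

/*
 * Models a state request that returns success although the link broke
 * concurrently and the connection state was reset to C_UNCONNECTED.
 */
static int request_state(struct conn *c, enum conn_state want)
{
	if (c->link_breaks_left > 0) {
		c->link_breaks_left--;
		c->cstate = C_UNCONNECTED;
		return SS_SUCCESS;
	}
	c->cstate = want;
	return SS_SUCCESS;
}

/* Returns 1 if the handshake may continue, 0 if the caller should retry. */
static int try_connect(struct conn *c)
{
	int rv = request_state(c, C_WF_REPORT_PARAMS);

	/* Checking rv alone would miss the reset state; check both. */
	if (rv < SS_SUCCESS || c->cstate != C_WF_REPORT_PARAMS)
		return 0;
	return 1;
}

/* Returns the number of attempts used, or -1 if all attempts failed. */
static int connect_with_retry(struct conn *c, int max_attempts)
{
	int attempts = 0;

	assert(max_attempts > 0);
	while (attempts < max_attempts) {
		attempts++;
		if (try_connect(c))
			return attempts;
	}
	return -1;
}
```

In this model, a connection whose link breaks during the first two
attempts is reported as connected on the third call to try_connect().
With only the rv check, the first attempt would have been treated as a
success while the state was still C_UNCONNECTED.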