net-timestamp: expand documentation

Expand Documentation/networking/timestamping.txt with new
interfaces and bytestream timestamping. Also minor
cleanup of the other text.

Import txtimestamp.c test of the new features.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn 2014-08-31 21:27:47 -04:00 committed by David S. Miller
parent c5a65680b3
commit 8fe2f761ca
3 changed files with 757 additions and 77 deletions


@ -1,102 +1,307 @@
The existing interfaces for getting network packages time stamped are:
1. Control Interfaces
The interfaces for receiving network packet timestamps are:
* SO_TIMESTAMP * SO_TIMESTAMP
Generate time stamp for each incoming packet using the (not necessarily Generates a timestamp for each incoming packet in (not necessarily
monotonous!) system time. Result is returned via recv_msg() in a monotonic) system time. Reports the timestamp via recvmsg() in a
control message as timeval (usec resolution). control message as struct timeval (usec resolution).
* SO_TIMESTAMPNS * SO_TIMESTAMPNS
Same time stamping mechanism as SO_TIMESTAMP, but returns result as Same timestamping mechanism as SO_TIMESTAMP, but reports the
timespec (nsec resolution). timestamp as struct timespec (nsec resolution).
* IP_MULTICAST_LOOP + SO_TIMESTAMP[NS] * IP_MULTICAST_LOOP + SO_TIMESTAMP[NS]
Only for multicasts: approximate send time stamp by receiving the looped Only for multicast: approximate transmit timestamp obtained by
packet and using its receive time stamp. reading the looped packet receive timestamp.
The following interface complements the existing ones: receive time * SO_TIMESTAMPING
stamps can be generated and returned for arbitrary packets and much Generates timestamps on reception, transmission or both. Supports
closer to the point where the packet is really sent. Time stamps can multiple timestamp sources, including hardware. Supports generating
be generated in software (as before) or in hardware (if the hardware timestamps for stream sockets.
has such a feature).
SO_TIMESTAMPING:
Instructs the socket layer which kind of information should be collected 1.1 SO_TIMESTAMP:
and/or reported. The parameter is an integer with some of the following
bits set. Setting other bits is an error and doesn't change the current
state.
Four of the bits are requests to the stack to try to generate This socket option enables timestamping of datagrams on the reception
timestamps. Any combination of them is valid. path. Because the destination socket, if any, is not known early in
the network stack, the feature has to be enabled for all packets. The
same is true for all early receive timestamp options.
SOF_TIMESTAMPING_TX_HARDWARE: try to obtain send time stamps in hardware For interface details, see `man 7 socket`.
SOF_TIMESTAMPING_TX_SOFTWARE: try to obtain send time stamps in software
SOF_TIMESTAMPING_RX_HARDWARE: try to obtain receive time stamps in hardware
SOF_TIMESTAMPING_RX_SOFTWARE: try to obtain receive time stamps in software 1.2 SO_TIMESTAMPNS:
This option is identical to SO_TIMESTAMP except for the returned data type.
Its struct timespec allows for higher resolution (ns) timestamps than the
timeval of SO_TIMESTAMP (us).
1.3 SO_TIMESTAMPING:
Supports multiple types of timestamp requests. As a result, this
socket option takes a bitmap of flags, not a boolean. In
err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) &val, sizeof(val));
val is an integer with any of the following bits set. Setting any other
bit returns EINVAL and does not change the current state.
1.3.1 Timestamp Generation
Some bits are requests to the stack to try to generate timestamps. Any
combination of them is valid. Changes to these bits apply to newly
created packets, not to packets already in the stack. As a result, it
is possible to selectively request timestamps for a subset of packets
(e.g., for sampling) by embedding a send() call between two setsockopt
calls, one to enable timestamp generation and one to disable it.
Timestamps may also be generated for reasons other than being
requested by a particular socket, such as when receive timestamping is
enabled system wide, as explained earlier.
SOF_TIMESTAMPING_RX_HARDWARE:
Request rx timestamps generated by the network adapter.
SOF_TIMESTAMPING_RX_SOFTWARE:
Request rx timestamps when data enters the kernel. These timestamps
are generated just after a device driver hands a packet to the
kernel receive stack.
SOF_TIMESTAMPING_TX_HARDWARE:
Request tx timestamps generated by the network adapter.
SOF_TIMESTAMPING_TX_SOFTWARE:
Request tx timestamps when data leaves the kernel. These timestamps
are generated in the device driver as close as possible, but always
prior to, passing the packet to the network interface. Hence, they
require driver support and may not be available for all devices.
SOF_TIMESTAMPING_TX_SCHED:
Request tx timestamps prior to entering the packet scheduler. Kernel
transmit latency is, if long, often dominated by queuing delay. The
difference between this timestamp and one taken at
SOF_TIMESTAMPING_TX_SOFTWARE will expose this latency independent
of protocol processing. The latency incurred in protocol
processing, if any, can be computed by subtracting a userspace
timestamp taken immediately before send() from this timestamp. On
machines with virtual devices where a transmitted packet travels
through multiple devices and, hence, multiple packet schedulers,
a timestamp is generated at each layer. This allows for fine
grained measurement of queuing delay.
SOF_TIMESTAMPING_TX_ACK:
Request tx timestamps when all data in the send buffer has been
acknowledged. This only makes sense for reliable protocols. It is
currently only implemented for TCP. For that protocol, it may
over-report measurement, because the timestamp is generated when all
data up to and including the buffer at send() was acknowledged: the
cumulative acknowledgment. The mechanism ignores SACK and FACK.
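The latency computations described for SOF_TIMESTAMPING_TX_SCHED reduce to subtracting two struct timespec values. The following is a minimal sketch; the helper name timespec_delta_ns() is hypothetical, and it assumes both timestamps come from the same clock (for instance a userspace gettimeofday() reading compared with a software timestamp in system time).

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: nanoseconds elapsed from 'earlier' to 'later'.
 * The result is negative if the two arguments are out of order.
 */
static int64_t timespec_delta_ns(const struct timespec *later,
				 const struct timespec *earlier)
{
	int64_t sec = (int64_t) later->tv_sec - (int64_t) earlier->tv_sec;
	int64_t nsec = (int64_t) later->tv_nsec - (int64_t) earlier->tv_nsec;

	return sec * INT64_C(1000000000) + nsec;
}
```

With a userspace timestamp usr, a SCHED timestamp sched and a SND timestamp snd, timespec_delta_ns(&sched, &usr) approximates protocol processing time and timespec_delta_ns(&snd, &sched) approximates queuing delay.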
1.3.2 Timestamp Reporting
The other three bits control which timestamps will be reported in a The other three bits control which timestamps will be reported in a
generated control message. If none of these bits are set or if none of generated control message. Changes to the bits take immediate
the set bits correspond to data that is available, then the control effect at the timestamp reporting locations in the stack. Timestamps
message will not be generated: are only reported for packets that also have the relevant timestamp
generation request set.
SOF_TIMESTAMPING_SOFTWARE: report systime if available SOF_TIMESTAMPING_SOFTWARE:
SOF_TIMESTAMPING_SYS_HARDWARE: report hwtimetrans if available (deprecated) Report any software timestamps when available.
SOF_TIMESTAMPING_RAW_HARDWARE: report hwtimeraw if available
It is worth noting that timestamps may be collected for reasons other SOF_TIMESTAMPING_SYS_HARDWARE:
than being requested by a particular socket with This option is deprecated and ignored.
SOF_TIMESTAMPING_[TR]X_(HARD|SOFT)WARE. For example, most drivers that
can generate hardware receive timestamps ignore
SOF_TIMESTAMPING_RX_HARDWARE. It is still a good idea to set that flag
in case future drivers pay attention.
If timestamps are reported, they will appear in a control message with SOF_TIMESTAMPING_RAW_HARDWARE:
cmsg_level==SOL_SOCKET, cmsg_type==SO_TIMESTAMPING, and a payload like Report hardware timestamps as generated by
this: SOF_TIMESTAMPING_TX_HARDWARE when available.
1.3.3 Timestamp Options
The interface supports one option:
SOF_TIMESTAMPING_OPT_ID:
Generate a unique identifier along with each packet. A process can
have multiple concurrent timestamping requests outstanding. Packets
can be reordered in the transmit path, for instance in the packet
scheduler. In that case timestamps will be queued onto the error
queue out of order from the original send() calls. This option
embeds a counter that is incremented at send() time, to order
timestamps within a flow.
This option is implemented only for transmit timestamps. There, the
timestamp is always looped along with a struct sock_extended_err.
The option modifies field ee_info to pass an id that is unique
among all possibly concurrently outstanding timestamp requests for
that socket. In practice, it is a monotonically increasing u32
(that wraps).
In datagram sockets, the counter increments on each send call. In
stream sockets, it increments with every byte.
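Because the identifier is a u32 that wraps, ordering two ids with a plain less-than comparison breaks at the wrap point. One common way to compare such counters is serial-number arithmetic, sketched below. The helper name tskey_before() is hypothetical, and the sketch assumes that the two ids are less than 2^31 apart and that the compiler converts out-of-range values to int32_t in two's complement fashion (as GCC and Clang do).

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero if id 'a' was issued before id 'b',
 * treating the ids as a wrapping 32-bit counter. Only valid while the two
 * ids are less than 2^31 apart.
 */
static int tskey_before(uint32_t a, uint32_t b)
{
	return (int32_t) (a - b) < 0;
}
```

A process can use such a comparison to sort timestamps that were queued onto the error queue out of order.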
1.4 Bytestream Timestamps
The SO_TIMESTAMPING interface supports timestamping of bytes in a
bytestream. Each request is interpreted as a request for when the
entire contents of the buffer has passed a timestamping point. That
is, for streams option SOF_TIMESTAMPING_TX_SOFTWARE will record
when all bytes have reached the device driver, regardless of how
many packets the data has been converted into.
In general, bytestreams have no natural delimiters and therefore
correlating a timestamp with data is non-trivial. A range of bytes
may be split across segments, and any segments may be merged (possibly
coalescing sections of previously segmented buffers associated with
independent send() calls). Segments can be reordered and the same
byte range can coexist in multiple segments for protocols that
implement retransmissions.
It is essential that all timestamps implement the same semantics,
regardless of these possible transformations, as otherwise they are
incomparable. Handling "rare" corner cases differently from the
simple case (a 1:1 mapping from buffer to skb) is insufficient
because performance debugging often needs to focus on such outliers.
In practice, timestamps can be correlated with segments of a
bytestream consistently, if both semantics of the timestamp and the
timing of measurement are chosen correctly. This challenge is no
different from deciding on a strategy for IP fragmentation. There, the
definition is that only the first fragment is timestamped. For
bytestreams, we chose that a timestamp is generated only when all
bytes have passed a point. SOF_TIMESTAMPING_TX_ACK as defined is easy to
implement and reason about. An implementation that has to take into
account SACK would be more complex due to possible transmission holes
and out of order arrival.
On the host, TCP can also break the simple 1:1 mapping from buffer to
skbuff as a result of Nagle, cork, autocork, segmentation and GSO. The
implementation ensures correctness in all cases by tracking the
individual last byte passed to send(), even if it is no longer the
last byte after an skbuff extend or merge operation. It stores the
relevant sequence number in skb_shinfo(skb)->tskey. Because an skbuff
has only one such field, only one timestamp can be generated.
In rare cases, a timestamp request can be missed if two requests are
collapsed onto the same skb. A process can detect this situation by
enabling SOF_TIMESTAMPING_OPT_ID and comparing the byte offset at
send time with the value returned for each timestamp. It can prevent
the situation by always flushing the TCP stack in between requests,
for instance by enabling TCP_NODELAY and disabling TCP_CORK and
autocork.
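The detection strategy above requires the process to know which id to expect for each send() call. The sketch below tracks that on the sending side. It rests on an assumption that is not stated in this document: that the id reported for a stream send() is the zero-based offset of the last byte of that buffer, counted from the moment SOF_TIMESTAMPING_OPT_ID was enabled. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one stream socket. */
struct stream_tracker {
	uint32_t bytes_sent;	/* bytes passed to send() since OPT_ID was set */
};

/*
 * Record a successful send() of 'len' bytes (len > 0) and return the id
 * expected in the matching timestamp. ASSUMPTION: the kernel reports the
 * offset of the last byte of the buffer; verify against observed output.
 * The counter wraps at 2^32, like the id itself.
 */
static uint32_t tracker_on_send(struct stream_tracker *t, uint32_t len)
{
	t->bytes_sent += len;
	return t->bytes_sent - 1;
}
```

If a timestamp never arrives for an expected id while one arrives for a later id, the earlier request was probably collapsed onto the same skb as the later one.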
These precautions ensure that the timestamp is generated only when all
bytes have passed a timestamp point, assuming that the network stack
itself does not reorder the segments. The stack indeed tries to avoid
reordering. The one exception is under administrator control: it is
possible to construct a packet scheduler configuration that delays
segments from the same stream differently. Such a setup would be
unusual.
2 Data Interfaces
Timestamps are read using the ancillary data feature of recvmsg().
See `man 3 cmsg` for details of this interface. The socket manual
page (`man 7 socket`) describes how timestamps generated with
SO_TIMESTAMP and SO_TIMESTAMPNS can be retrieved.
2.1 SCM_TIMESTAMPING records
These timestamps are returned in a control message with cmsg_level
SOL_SOCKET, cmsg_type SCM_TIMESTAMPING, and payload of type
struct scm_timestamping { struct scm_timestamping {
struct timespec systime; struct timespec ts[3];
struct timespec hwtimetrans;
struct timespec hwtimeraw;
}; };
recvmsg() can be used to get this control message for regular incoming The structure can return up to three timestamps. This is a legacy
packets. For send time stamps the outgoing packet is looped back to feature. Only one field is non-zero at any time. Most timestamps
the socket's error queue with the send time stamp(s) attached. It can are passed in ts[0]. Hardware timestamps are passed in ts[2].
be received with recvmsg(flags=MSG_ERRQUEUE). The call returns the
original outgoing packet data including all headers preprended down to
and including the link layer, the scm_timestamping control message and
a sock_extended_err control message with ee_errno==ENOMSG and
ee_origin==SO_EE_ORIGIN_TIMESTAMPING. A socket with such a pending
bounced packet is ready for reading as far as select() is concerned.
If the outgoing packet has to be fragmented, then only the first
fragment is time stamped and returned to the sending socket.
All three values correspond to the same event in time, but were ts[1] used to hold hardware timestamps converted to system time.
generated in different ways. Each of these values may be empty (= all Instead, expose the hardware clock device on the NIC directly as
zero), in which case no such value was available. If the application a HW PTP clock source, to allow time conversion in userspace and
is not interested in some of these values, they can be left blank to optionally synchronize system time with a userspace PTP stack such
avoid the potential overhead of calculating them. as linuxptp. For the PTP clock API, see Documentation/ptp/ptp.txt.
systime is the value of the system time at that moment. This 2.1.1 Transmit timestamps with MSG_ERRQUEUE
corresponds to the value also returned via SO_TIMESTAMP[NS]. If the
time stamp was generated by hardware, then this field is
empty. Otherwise it is filled in if SOF_TIMESTAMPING_SOFTWARE is
set.
hwtimeraw is the original hardware time stamp. Filled in if For transmit timestamps the outgoing packet is looped back to the
SOF_TIMESTAMPING_RAW_HARDWARE is set. No assumptions about its socket's error queue with the send timestamp(s) attached. A process
relation to system time should be made. receives the timestamps by calling recvmsg() with flag MSG_ERRQUEUE
set and with a msg_control buffer sufficiently large to receive the
relevant metadata structures. The recvmsg call returns the original
outgoing data packet with two ancillary messages attached.
hwtimetrans is always zero. This field is deprecated. It used to hold A message of cmsg_level SOL_IP(V6) and cmsg_type IP(V6)_RECVERR
hw timestamps converted to system time. Instead, expose the hardware embeds a struct sock_extended_err. This defines the error type. For
clock device on the NIC directly as a HW PTP clock source, to allow timestamps, the ee_errno field is ENOMSG. The other ancillary message
time conversion in userspace and optionally synchronize system time will have cmsg_level SOL_SOCKET and cmsg_type SCM_TIMESTAMPING. This
with a userspace PTP stack such as linuxptp. For the PTP clock API, embeds the struct scm_timestamping.
see Documentation/ptp/ptp.txt.
SIOCSHWTSTAMP, SIOCGHWTSTAMP: 2.1.1.2 Timestamp types
The semantics of the three struct timespec are defined by field
ee_info in the extended error structure. It contains a value of
type SCM_TSTAMP_* to define the actual timestamp passed in
scm_timestamping.
The SCM_TSTAMP_* types are 1:1 matches to the SOF_TIMESTAMPING_*
control fields discussed previously, with one exception. For legacy
reasons, SCM_TSTAMP_SND is equal to zero and can be set for both
SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE. It
is the first if ts[2] is non-zero, the second otherwise, in which
case the timestamp is stored in ts[0].
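The rule for interpreting ee_info can be sketched as a small classifier. This is an illustration only: the function names and the returned strings are hypothetical, and the sketch assumes struct scm_timestamping and the SCM_TSTAMP_* constants are provided by linux/errqueue.h, as in the test program imported by this commit.

```c
#include <stdint.h>
#include <time.h>
#include <linux/errqueue.h>

/* Hypothetical helper: a timestamp field is unset if it is all zero. */
static int ts_is_set(const struct timespec *ts)
{
	return ts->tv_sec || ts->tv_nsec;
}

/*
 * Hypothetical helper: name the timestamp carried in 'tss', given the
 * ee_info value from the accompanying struct sock_extended_err.
 * SCM_TSTAMP_SND covers both hardware and software send timestamps;
 * a non-zero ts[2] marks the hardware case.
 */
static const char *tx_tstamp_name(const struct scm_timestamping *tss,
				  uint32_t ee_info)
{
	switch (ee_info) {
	case SCM_TSTAMP_SCHED:
		return "sched";
	case SCM_TSTAMP_ACK:
		return "ack";
	case SCM_TSTAMP_SND:
		return ts_is_set(&tss->ts[2]) ? "snd-hardware" : "snd-software";
	default:
		return "unknown";
	}
}
```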
2.1.1.3 Fragmentation
Fragmentation of outgoing datagrams is rare, but is possible, e.g., by
explicitly disabling PMTU discovery. If an outgoing packet is fragmented,
then only the first fragment is timestamped and returned to the sending
socket.
2.1.1.4 Packet Payload
The calling application is often not interested in receiving the whole
packet payload that it passed to the stack originally: the socket
error queue mechanism is just a method to piggyback the timestamp on.
In this case, the application can choose to read datagrams with a
smaller buffer, possibly even of length 0. The payload is truncated
accordingly. Until the process calls recvmsg() on the error queue,
however, the full packet is queued, taking up budget from SO_RCVBUF.
2.1.1.5 Blocking Read
Reading from the error queue is always a non-blocking operation. To
block waiting on a timestamp, use poll or select. poll() will return
POLLERR in pollfd.revents if any data is ready on the error queue.
There is no need to pass this flag in pollfd.events: it is ignored
there and only reported in revents. See also `man 2 poll`.
2.1.2 Receive timestamps
On reception, there is no reason to read from the socket error queue.
The SCM_TIMESTAMPING ancillary data is sent along with the packet data
on a normal recvmsg(). Since this is not a socket error, it is not
accompanied by a message SOL_IP(V6)/IP(V6)_RECVERR. In this case,
the meaning of the three fields in struct scm_timestamping is
implicitly defined. ts[0] holds a software timestamp if set, ts[1]
is again deprecated and ts[2] holds a hardware timestamp if set.
3. Hardware Timestamping configuration: SIOCSHWTSTAMP and SIOCGHWTSTAMP
Hardware time stamping must also be initialized for each device driver Hardware time stamping must also be initialized for each device driver
that is expected to do hardware time stamping. The parameter is defined in that is expected to do hardware time stamping. The parameter is defined in
@ -167,8 +372,7 @@ enum {
*/ */
}; };
3.1 Hardware Timestamping Implementation: Device Drivers
DEVICE IMPLEMENTATION
A driver which supports hardware time stamping must support the A driver which supports hardware time stamping must support the
SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with


@ -1,14 +1,20 @@
# To compile, from the source root
#
# make headers_install
# make M=documentation
# kbuild trick to avoid linker error. Can be omitted if a module is built. # kbuild trick to avoid linker error. Can be omitted if a module is built.
obj- := dummy.o obj- := dummy.o
# List of programs to build # List of programs to build
hostprogs-y := timestamping hwtstamp_config hostprogs-y := timestamping txtimestamp hwtstamp_config
# Tell kbuild to always build the programs # Tell kbuild to always build the programs
always := $(hostprogs-y) always := $(hostprogs-y)
HOSTCFLAGS_timestamping.o += -I$(objtree)/usr/include HOSTCFLAGS_timestamping.o += -I$(objtree)/usr/include
HOSTCFLAGS_txtimestamp.o += -I$(objtree)/usr/include
HOSTCFLAGS_hwtstamp_config.o += -I$(objtree)/usr/include HOSTCFLAGS_hwtstamp_config.o += -I$(objtree)/usr/include
clean: clean:
rm -f timestamping hwtstamp_config rm -f timestamping txtimestamp hwtstamp_config


@ -0,0 +1,470 @@
/*
* Copyright 2014 Google Inc.
* Author: willemb@google.com (Willem de Bruijn)
*
* Test software tx timestamping, including
*
* - SCHED, SND and ACK timestamps
* - RAW, UDP and TCP
* - IPv4 and IPv6
* - various packet sizes (to test GSO and TSO)
*
* Consult the command line arguments for help on running
* the various testcases.
*
* This test requires a dummy TCP server.
* A simple `nc6 [-u] -l -p $DESTPORT` will do
*
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
*/
#include <arpa/inet.h>
#include <asm/types.h>
#include <error.h>
#include <errno.h>
#include <linux/errqueue.h>
#include <linux/if_ether.h>
#include <linux/net_tstamp.h>
#include <netdb.h>
#include <net/if.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <netinet/tcp.h>
#include <netpacket/packet.h>
#include <poll.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>
/* command line parameters */
static int cfg_proto = SOCK_STREAM;
static int cfg_ipproto = IPPROTO_TCP;
static int cfg_num_pkts = 4;
static int do_ipv4 = 1;
static int do_ipv6 = 1;
static int cfg_payload_len = 10;
static uint16_t dest_port = 9000;
static struct sockaddr_in daddr;
static struct sockaddr_in6 daddr6;
static struct timespec ts_prev;
static void __print_timestamp(const char *name, struct timespec *cur,
uint32_t key, int payload_len)
{
if (!(cur->tv_sec | cur->tv_nsec))
return;
fprintf(stderr, " %s: %lu s %lu us (seq=%u, len=%u)",
name, cur->tv_sec, cur->tv_nsec / 1000,
key, payload_len);
if ((ts_prev.tv_sec | ts_prev.tv_nsec)) {
int64_t cur_us, prev_us; /* both values are in microseconds */
cur_us = (int64_t) cur->tv_sec * 1000 * 1000;
cur_us += cur->tv_nsec / 1000;
prev_us = (int64_t) ts_prev.tv_sec * 1000 * 1000;
prev_us += ts_prev.tv_nsec / 1000;
fprintf(stderr, " (%+ld us)", (long) (cur_us - prev_us));
}
ts_prev = *cur;
fprintf(stderr, "\n");
}
static void print_timestamp_usr(void)
{
struct timespec ts;
struct timeval tv; /* avoid dependency on -lrt */
gettimeofday(&tv, NULL);
ts.tv_sec = tv.tv_sec;
ts.tv_nsec = tv.tv_usec * 1000;
__print_timestamp(" USR", &ts, 0, 0);
}
static void print_timestamp(struct scm_timestamping *tss, int tstype,
int tskey, int payload_len)
{
const char *tsname;
switch (tstype) {
case SCM_TSTAMP_SCHED:
tsname = " ENQ";
break;
case SCM_TSTAMP_SND:
tsname = " SND";
break;
case SCM_TSTAMP_ACK:
tsname = " ACK";
break;
default:
error(1, 0, "unknown timestamp type: %u",
tstype);
}
__print_timestamp(tsname, &tss->ts[0], tskey, payload_len);
}
static void __poll(int fd)
{
struct pollfd pollfd;
int ret;
memset(&pollfd, 0, sizeof(pollfd));
pollfd.fd = fd;
ret = poll(&pollfd, 1, 100);
if (ret != 1)
error(1, errno, "poll");
}
static void __recv_errmsg_cmsg(struct msghdr *msg, int payload_len)
{
struct sock_extended_err *serr = NULL;
struct scm_timestamping *tss = NULL;
struct cmsghdr *cm;
for (cm = CMSG_FIRSTHDR(msg);
cm && cm->cmsg_len;
cm = CMSG_NXTHDR(msg, cm)) {
if (cm->cmsg_level == SOL_SOCKET &&
cm->cmsg_type == SCM_TIMESTAMPING) {
tss = (void *) CMSG_DATA(cm);
} else if ((cm->cmsg_level == SOL_IP &&
cm->cmsg_type == IP_RECVERR) ||
(cm->cmsg_level == SOL_IPV6 &&
cm->cmsg_type == IPV6_RECVERR)) {
serr = (void *) CMSG_DATA(cm);
if (serr->ee_errno != ENOMSG ||
serr->ee_origin != SO_EE_ORIGIN_TIMESTAMPING) {
fprintf(stderr, "unknown ip error %d %d\n",
serr->ee_errno,
serr->ee_origin);
serr = NULL;
}
} else
fprintf(stderr, "unknown cmsg %d,%d\n",
cm->cmsg_level, cm->cmsg_type);
}
if (serr && tss)
print_timestamp(tss, serr->ee_info, serr->ee_data, payload_len);
}
static int recv_errmsg(int fd)
{
static char ctrl[1024 /* overprovision*/];
static struct msghdr msg;
struct iovec entry;
static char *data;
int ret = 0;
data = malloc(cfg_payload_len);
if (!data)
error(1, 0, "malloc");
memset(&msg, 0, sizeof(msg));
memset(&entry, 0, sizeof(entry));
memset(ctrl, 0, sizeof(ctrl));
memset(data, 0, cfg_payload_len); /* data is a pointer: sizeof(data) is not the buffer size */
entry.iov_base = data;
entry.iov_len = cfg_payload_len;
msg.msg_iov = &entry;
msg.msg_iovlen = 1;
msg.msg_name = NULL;
msg.msg_namelen = 0;
msg.msg_control = ctrl;
msg.msg_controllen = sizeof(ctrl);
ret = recvmsg(fd, &msg, MSG_ERRQUEUE);
if (ret == -1 && errno != EAGAIN)
error(1, errno, "recvmsg");
__recv_errmsg_cmsg(&msg, ret);
free(data);
return ret == -1;
}
static void do_test(int family, unsigned int opt)
{
char *buf;
int fd, i, val, total_len;
if (family == PF_INET6 && cfg_proto != SOCK_STREAM) {
/* due to lack of checksum generation code */
fprintf(stderr, "test: skipping datagram over IPv6\n");
return;
}
total_len = cfg_payload_len;
if (cfg_proto == SOCK_RAW) {
total_len += sizeof(struct udphdr);
if (cfg_ipproto == IPPROTO_RAW)
total_len += sizeof(struct iphdr);
}
buf = malloc(total_len);
if (!buf)
error(1, 0, "malloc");
fd = socket(family, cfg_proto, cfg_ipproto);
if (fd < 0)
error(1, errno, "socket");
if (cfg_proto == SOCK_STREAM) {
val = 1;
if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
(char*) &val, sizeof(val)))
error(1, 0, "setsockopt no nagle");
if (family == PF_INET) {
if (connect(fd, (void *) &daddr, sizeof(daddr)))
error(1, errno, "connect ipv4");
} else {
if (connect(fd, (void *) &daddr6, sizeof(daddr6)))
error(1, errno, "connect ipv6");
}
}
opt |= SOF_TIMESTAMPING_SOFTWARE |
SOF_TIMESTAMPING_OPT_ID;
if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
(char *) &opt, sizeof(opt)))
error(1, 0, "setsockopt timestamping");
for (i = 0; i < cfg_num_pkts; i++) {
memset(&ts_prev, 0, sizeof(ts_prev));
memset(buf, 'a' + i, total_len);
buf[total_len - 2] = '\n';
buf[total_len - 1] = '\0';
if (cfg_proto == SOCK_RAW) {
struct udphdr *udph;
int off = 0;
if (cfg_ipproto == IPPROTO_RAW) {
struct iphdr *iph = (void *) buf;
memset(iph, 0, sizeof(*iph));
iph->ihl = 5;
iph->version = 4;
iph->ttl = 2;
iph->daddr = daddr.sin_addr.s_addr;
iph->protocol = IPPROTO_UDP;
/* kernel writes saddr, csum, len */
off = sizeof(*iph);
}
udph = (void *) buf + off;
udph->source = htons(9000); /* random spoof */
udph->dest = htons(dest_port);
udph->len = htons(sizeof(*udph) + cfg_payload_len);
udph->check = 0; /* not allowed for IPv6 */
}
print_timestamp_usr();
if (cfg_proto != SOCK_STREAM) {
if (family == PF_INET)
val = sendto(fd, buf, total_len, 0, (void *) &daddr, sizeof(daddr));
else
val = sendto(fd, buf, total_len, 0, (void *) &daddr6, sizeof(daddr6));
} else {
val = send(fd, buf, cfg_payload_len, 0);
}
if (val != total_len)
error(1, errno, "send");
/* wait for all errors to be queued, else ACKs arrive OOO */
usleep(50 * 1000);
__poll(fd);
while (!recv_errmsg(fd)) {}
}
if (close(fd))
error(1, errno, "close");
free(buf);
usleep(400 * 1000);
}
static void __attribute__((noreturn)) usage(const char *filepath)
{
fprintf(stderr, "\nUsage: %s [options] hostname\n"
"\nwhere options are:\n"
" -4: only IPv4\n"
" -6: only IPv6\n"
" -h: show this message\n"
" -l N: send N bytes at a time\n"
" -r: use raw\n"
" -R: use raw (IP_HDRINCL)\n"
" -p N: connect to port N\n"
" -u: use udp\n",
filepath);
exit(1);
}
static void parse_opt(int argc, char **argv)
{
int proto_count = 0;
int c; /* getopt() returns int; a char may never compare equal to -1 */
while ((c = getopt(argc, argv, "46hl:p:rRu")) != -1) {
switch (c) {
case '4':
do_ipv6 = 0;
break;
case '6':
do_ipv4 = 0;
break;
case 'r':
proto_count++;
cfg_proto = SOCK_RAW;
cfg_ipproto = IPPROTO_UDP;
break;
case 'R':
proto_count++;
cfg_proto = SOCK_RAW;
cfg_ipproto = IPPROTO_RAW;
break;
case 'u':
proto_count++;
cfg_proto = SOCK_DGRAM;
cfg_ipproto = IPPROTO_UDP;
break;
case 'l':
cfg_payload_len = strtoul(optarg, NULL, 10);
break;
case 'p':
dest_port = strtoul(optarg, NULL, 10);
break;
case 'h':
default:
usage(argv[0]);
}
}
if (!cfg_payload_len)
error(1, 0, "payload may not be zero");
if (cfg_proto != SOCK_STREAM && cfg_payload_len > 1472)
error(1, 0, "udp packet might exceed expected MTU");
if (!do_ipv4 && !do_ipv6)
error(1, 0, "pass -4 or -6, not both");
if (proto_count > 1)
error(1, 0, "pass -r, -R or -u, not multiple");
if (optind != argc - 1)
error(1, 0, "missing required hostname argument");
}
static void resolve_hostname(const char *hostname)
{
struct addrinfo *addrs, *cur;
int have_ipv4 = 0, have_ipv6 = 0;
if (getaddrinfo(hostname, NULL, NULL, &addrs))
error(1, errno, "getaddrinfo");
cur = addrs;
while (cur && !have_ipv4 && !have_ipv6) {
if (!have_ipv4 && cur->ai_family == AF_INET) {
memcpy(&daddr, cur->ai_addr, sizeof(daddr));
daddr.sin_port = htons(dest_port);
have_ipv4 = 1;
}
else if (!have_ipv6 && cur->ai_family == AF_INET6) {
memcpy(&daddr6, cur->ai_addr, sizeof(daddr6));
daddr6.sin6_port = htons(dest_port);
have_ipv6 = 1;
}
cur = cur->ai_next;
}
if (addrs)
freeaddrinfo(addrs);
do_ipv4 &= have_ipv4;
do_ipv6 &= have_ipv6;
}
static void do_main(int family)
{
fprintf(stderr, "family: %s\n",
family == PF_INET ? "INET" : "INET6");
fprintf(stderr, "test SND\n");
do_test(family, SOF_TIMESTAMPING_TX_SOFTWARE);
fprintf(stderr, "test ENQ\n");
do_test(family, SOF_TIMESTAMPING_TX_SCHED);
fprintf(stderr, "test ENQ + SND\n");
do_test(family, SOF_TIMESTAMPING_TX_SCHED |
SOF_TIMESTAMPING_TX_SOFTWARE);
if (cfg_proto == SOCK_STREAM) {
fprintf(stderr, "\ntest ACK\n");
do_test(family, SOF_TIMESTAMPING_TX_ACK);
fprintf(stderr, "\ntest SND + ACK\n");
do_test(family, SOF_TIMESTAMPING_TX_SOFTWARE |
SOF_TIMESTAMPING_TX_ACK);
fprintf(stderr, "\ntest ENQ + SND + ACK\n");
do_test(family, SOF_TIMESTAMPING_TX_SCHED |
SOF_TIMESTAMPING_TX_SOFTWARE |
SOF_TIMESTAMPING_TX_ACK);
}
}
const char *sock_names[] = { NULL, "TCP", "UDP", "RAW" };
int main(int argc, char **argv)
{
if (argc == 1)
usage(argv[0]);
parse_opt(argc, argv);
resolve_hostname(argv[argc - 1]);
fprintf(stderr, "protocol: %s\n", sock_names[cfg_proto]);
fprintf(stderr, "payload: %u\n", cfg_payload_len);
fprintf(stderr, "server port: %u\n", dest_port);
fprintf(stderr, "\n");
if (do_ipv4)
do_main(PF_INET);
if (do_ipv6)
do_main(PF_INET6);
return 0;
}