API Reference¶
On this page we give a complete overview of all the primitives we expose in the main EBSP library.
Host¶
bsp_init¶
-
int
bsp_init
(const char *e_name, int argc, char **argv)¶ Initializes the BSP system.
Sets up all the BSP variables and loads the Epiphany BSP program.
- Return
- 1 on success, 0 on failure
- Parameters
e_name
: A string with the srec binary name of the Epiphany program
argc
: The number of input arguments
argv
: An array of strings with the input arguments
The string e_name must be of the form myprogram.srec. This function searches for the file in the same directory as the host program, not in the current working directory.
Usage example:
    int main(int argc, char** argv)
    {
        bsp_init("e_program.srec", argc, argv);
        ...
        return 0;
    }
- Remark
- The argc and argv parameters are ignored in the current implementation.
ebsp_spmd¶
-
int
ebsp_spmd
()¶ Runs the Epiphany program on the Epiphany cores.
This function will block until the BSP kernel program is finished.
- Return
- 1 on success, 0 on failure (e.g. after bsp_abort is called on a core)
bsp_begin¶
-
int
bsp_begin
(int nprocs)¶ Loads the BSP program onto the Epiphany cores.
Usage example:
    int main(int argc, char** argv)
    {
        bsp_init("e_program.srec", argc, argv);
        bsp_begin(bsp_nprocs());
        ...
        return 0;
    }
- Return
- 1 on success, 0 on failure
- Parameters
nprocs
: The number of processors to run on
- Remark
- The current implementation only allows nprocs to be a multiple of 4 on the 16-core Parallella. Other values of nprocs are rounded down.
bsp_end¶
-
int
bsp_end
()¶ Finalizes and cleans up the BSP program.
Usage example:
    int main(int argc, char** argv)
    {
        bsp_init("e_program.srec", argc, argv);
        bsp_begin(bsp_nprocs());
        ebsp_spmd();
        bsp_end();
        return 0;
    }
- Return
- 1 on success, 0 on failure
- Remark
- This function is different from the bsp_end function in e_bsp.h
bsp_nprocs¶
-
int
bsp_nprocs
()¶ Returns the number of available processors (Epiphany cores).
This function may be called after bsp_init().
- Return
- The number of available processors
ebsp_set_tagsize¶
Set initial tagsize for message passing.
The default tagsize is zero. This function should be called at most once, before any messages are sent. Calling this when receiving messages results in undefined behaviour.
- Parameters
tag_bytes
: A pointer to an integer containing the new tagsize, receiving the old tagsize on return.
It is not possible to send messages with different tag sizes. Doing so will result in undefined behaviour.
- Remark
- The tagsize set using this function is also used for inter-core messages.
ebsp_send_down¶
-
void
ebsp_send_down
(int pid, const void *tag, const void *payload, int nbytes)¶ Send a message to the Epiphany cores.
This is the preferred way to send initial data (for computation) to the Epiphany cores.
- Parameters
pid
: The pid of the target processor
tag
: A pointer to the message tag
payload
: A pointer to the data payload
nbytes
: The size of the payload in bytes
The size of the buffer pointed to by tag has to be tagsize, and must be the same for every message being sent.
ebsp_get_tagsize¶
Get the tagsize as set by the Epiphany program.
Use only for gathering result messages at the end of a BSP program.
- Return
- The tagsize in bytes
When ebsp_spmd() returns, the Epiphany program may have set a different tagsize, which can be obtained using this function.
ebsp_qsize¶
-
void
ebsp_qsize
(int *packets, int *accum_bytes)¶ Get the number of messages in the queue and their total size in bytes.
Use only for gathering result messages at the end of a BSP program.
- Parameters
packets
: A pointer to an integer receiving the number of messages
accum_bytes
: A pointer to an integer receiving the total size of the data payloads of the messages, in bytes.
ebsp_get_tag¶
-
void
ebsp_get_tag
(int *status, void *tag)¶ Peek at the next message.
Use only for gathering result messages at the end of a BSP program.
- Parameters
status
: A pointer to an integer receiving the number of bytes of the next message payload, or -1 if there are no more messages.
tag
: A pointer to a buffer receiving the tag of the next message. This buffer should be large enough to hold the tag (at least ebsp_get_tagsize() bytes).
ebsp_move¶
-
void
ebsp_move
(void *payload, int buffer_size)¶ Get the next message from the message queue and pop the message.
This will copy the payload and pop the message from the queue. The size of the payload can be obtained by calling ebsp_get_tag(). If buffer_size is smaller than the data payload then the data is truncated.
- Parameters
payload
: A pointer to a buffer receiving the data payload
buffer_size
: The size of the buffer
Use only for gathering result messages at the end of a BSP program.
ebsp_hpmove¶
-
int
ebsp_hpmove
(void **tag_ptr_buf, void **payload_ptr_buf)¶ Get the next message, with tag, from the queue and pop the message.
This is the faster alternative to ebsp_move(), as this function does not copy the data but returns pointers to it.
- Return
- The number of bytes of the payload data
- Parameters
tag_ptr_buf
: A pointer to a pointer receiving the location of the tag
payload_ptr_buf
: A pointer to a pointer receiving the location of the data payload
Use only for gathering result messages at the end of a BSP program.
bsp_stream_create¶
-
void *
bsp_stream_create
(int stream_size, int token_size, const void *initial_data)¶ Creates a generic stream for streaming data to or from an Epiphany core.
The function returns NULL on failure.
- Return
- A pointer to a section of external memory storing the tokens.
- Parameters
stream_size
: The total number of bytes of data in the stream.
token_size
: The size in bytes of a single token. Must be at least 16.
initial_data
: (Optional) The data which should be streamed to an Epiphany core.
If initial_data is nonzero, it is copied to the stream (stream_size bytes). If initial_data is zero, an empty stream of size stream_size is created. In this case, stream_size should be the maximum number of bytes that will be sent up from the Epiphany cores to the host.
This function prints an error if token_size is less than 16.
The format of the data pointed to by the return value is as follows: before every token, there are two integers that specify the size of the preceding token and the size of the token itself.
00000000, nextsize, data, prevsize, nextsize, data, … prevsize, nextsize, data, prevsize, 00000000
So a header consists of two integers (8 bytes in total). The two sizes do NOT include these headers; they give only the size of the data in between.
If you want to use the returned pointer directly, you have to take care of this data format manually.
- Remark
- If initial_data is nonzero, the data is copied so that after the call it can safely be freed or overwritten by the user.
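The header layout described above can be walked with a few lines of C. The following is only a sketch under stated assumptions: both header integers are 32 bits wide (matching the 8-byte header), they use the native byte order, and there is no padding between a token and the next header. The helpers pack_tokens and walk_tokens are hypothetical names for illustration and are not part of the EBSP API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One header: two 32-bit integers (prevsize, nextsize), 8 bytes in total. */
enum { HEADER_BYTES = 8 };

/* Hypothetical helper: packs `count` tokens of `token_size` bytes from `data`
 * into `out` using the layout described above. Returns the bytes written.
 * `out` must hold count * (token_size + 8) + 8 bytes. */
static size_t pack_tokens(uint8_t *out, const uint8_t *data,
                          int32_t token_size, int32_t count)
{
    assert(token_size > 0 && count >= 0);
    size_t pos = 0;
    int32_t prev = 0;                 /* the first header has prevsize 0 */
    for (int32_t i = 0; i < count; i++) {
        memcpy(out + pos, &prev, 4);              /* prevsize */
        memcpy(out + pos + 4, &token_size, 4);    /* nextsize */
        pos += HEADER_BYTES;
        memcpy(out + pos, data + (size_t)i * (size_t)token_size,
               (size_t)token_size);               /* token data */
        pos += (size_t)token_size;
        prev = token_size;
    }
    int32_t zero = 0;
    memcpy(out + pos, &prev, 4);      /* closing header: prevsize */
    memcpy(out + pos + 4, &zero, 4);  /* nextsize 0 marks the end */
    return pos + HEADER_BYTES;
}

/* Hypothetical helper: follows the nextsize fields until the closing header.
 * Returns the total payload bytes and stores the number of tokens. */
static size_t walk_tokens(const uint8_t *stream, int32_t *count_out)
{
    size_t pos = 0, total = 0;
    int32_t count = 0;
    for (;;) {
        int32_t next;
        memcpy(&next, stream + pos + 4, 4);
        if (next == 0)
            break;
        pos += HEADER_BYTES + (size_t)next;   /* skip header and token */
        total += (size_t)next;
        count++;
    }
    *count_out = count;
    return total;
}
```

With two 16-byte tokens the packed buffer occupies 2 * (8 + 16) + 8 = 56 bytes, which shows that the headers add to the size given by stream_size.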
ebsp_write¶
-
int
ebsp_write
(int pid, void *src, off_t dst, int size)¶ Write data to the Epiphany processor.
This is an alternative to the BSP Message Passing system.
- Return
- 1 on success, 0 on failure
- Parameters
pid
: The pid of the target processor
src
: A pointer to the source data
dst
: The destination address (as seen by the Epiphany core)
size
: The number of bytes to be copied
ebsp_read¶
-
int
ebsp_read
(int pid, off_t src, void *dst, int size)¶ Read data from the Epiphany processor.
This is an alternative to the BSP Message Passing system.
- Return
- 1 on success, 0 on failure
- Parameters
pid
: The pid of the source processor
src
: The source address (as seen by the Epiphany core)
dst
: A pointer to a buffer receiving the data
size
: The number of bytes to be copied
ebsp_set_sync_callback¶
-
void
ebsp_set_sync_callback
(void (*cb)())¶ Set the (optional) callback for synchronizing Epiphany cores with the host program.
This callback is called when all Epiphany cores have called ebsp_host_sync(). Note that this does not happen at bsp_sync().
- Parameters
cb
: A function pointer to the callback function
ebsp_set_end_callback¶
-
void
ebsp_set_end_callback
(void (*cb)())¶ Set the (optional) callback for finalizing.
This callback is called when ebsp_spmd() finishes. It is primarily used by the ebsp memory inspector and should not be needed.
- Parameters
cb
: A function pointer to the callback function
Epiphany¶
bsp_begin¶
-
void
bsp_begin
()¶ Denotes the start of a BSP program.
This initializes the BSP system on the core.
Must be called before calling any other BSP function. Should only be called once in a program.
bsp_end¶
-
void
bsp_end
() Denotes the end of a BSP program.
Finalizes and cleans up the BSP program. No other BSP functions are allowed to be called after this function is called.
- Remark
- Must be followed by a return statement in your main function if you want to call ebsp_spmd() multiple times.
bsp_nprocs¶
-
int
bsp_nprocs
() Obtain the number of Epiphany cores currently in use.
- Return
- An integer indicating the number of cores on which the program runs.
bsp_pid¶
-
int
bsp_pid
()¶ Obtain the processor identifier of the local core.
- Return
- An integer with the id of the core
The processor id is an integer in the range [0, .., bsp_nprocs() - 1].
bsp_time¶
-
float
bsp_time
()¶ Obtain the time in seconds since bsp_begin() was called.
The native Epiphany timer does not support time differences longer than UINT_MAX/(600000000) seconds, which is roughly 7 seconds.
- Return
- A floating point value with the number of elapsed seconds since the call to bsp_begin()
If you want to measure longer time intervals, we suggest you use the (less accurate) ebsp_host_time().
- Remark
- Using this in combination with ebsp_raw_time() leads to unspecified behaviour; you should use only one of these in your program.
- Remark
- This uses the internal Epiphany E_CTIMER_0 timer so the second timer can be used for other purposes.
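The limit of roughly 7 seconds follows directly from the expression UINT_MAX/(600000000). The sketch below reproduces that arithmetic, assuming a 32-bit cycle counter and the 600 MHz clock implied by the text; CLOCK_HZ, max_timer_interval_seconds and interval_fits_timer are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Clock frequency implied by the text above (600 MHz); an assumption here. */
#define CLOCK_HZ 600000000.0

/* Longest interval, in seconds, that a 32-bit cycle counter can represent. */
static double max_timer_interval_seconds(void)
{
    return (double)UINT32_MAX / CLOCK_HZ;   /* about 7.16 seconds */
}

/* Returns nonzero if an interval of `seconds` fits within that limit. */
static int interval_fits_timer(double seconds)
{
    assert(seconds >= 0.0);
    return seconds <= max_timer_interval_seconds();
}
```

An interval of 5 seconds fits within this bound, while one of 10 seconds does not and would call for ebsp_host_time() instead.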
ebsp_host_time¶
-
float
ebsp_host_time
()¶ Obtain the time in seconds since bsp_begin() was called.
This function uses the system clock of the host to obtain the elapsed time. Because of varying amounts of latency this can be very inaccurate (its precision is in the order of milliseconds), but it supports time intervals of arbitrary length.
- Return
- A floating point value with the number of seconds since bsp_begin()
ebsp_raw_time¶
-
unsigned int
ebsp_raw_time
()¶ Obtain the number of clock cycles that have passed since the previous call to ebsp_raw_time().
This function has less overhead than bsp_time().
- Return
- An unsigned integer with the number of clock cycles
Divide the number of clock cycles by 600 000 000 to get the time in seconds.
- Remark
- Using this in combination with bsp_time() leads to unspecified behaviour; you should use only one of these in your program.
- Remark
- This uses the internal Epiphany E_CTIMER_0 timer so the second timer can be used for other purposes.
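Because each call reports the cycles elapsed since the previous call, a longer measurement can be assembled by summing the individual readings before converting to seconds. The sketch below shows that conversion with the divisor of 600 000 000 given above; cycles_to_seconds and total_seconds are hypothetical helpers, not EBSP functions, and the readings are passed in as plain numbers so the example does not depend on the hardware timer.

```c
#include <assert.h>

/* Divisor from the text above: clock cycles per second. */
#define CYCLES_PER_SECOND 600000000.0

/* Converts one cycle count to seconds. */
static double cycles_to_seconds(unsigned int cycles)
{
    return (double)cycles / CYCLES_PER_SECOND;
}

/* Sums several cycle readings, accumulating in double so that the sum
 * cannot overflow an unsigned int, and converts the total to seconds. */
static double total_seconds(const unsigned int *readings, int count)
{
    assert(count >= 0);
    double cycles = 0.0;
    for (int i = 0; i < count; i++)
        cycles += (double)readings[i];
    return cycles / CYCLES_PER_SECOND;
}
```

Each individual reading is still subject to the range of the underlying counter, so the readings should be taken often enough that no single one wraps around.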
bsp_sync¶
-
void
bsp_sync
()¶ Denotes the end of a superstep, and performs all outstanding communications and registrations.
Serves as a blocking barrier which halts execution until all Epiphany cores are finished with the current superstep.
If only a synchronization is required, and you do not want the outstanding communications and registrations to be resolved, then we suggest you use the more efficient function ebsp_barrier().
ebsp_barrier¶
-
void
ebsp_barrier
()¶ Synchronizes cores without resolving outstanding communication.
This function is more efficient than bsp_sync().
bsp_push_reg¶
-
void
bsp_push_reg
(const void *variable, const int nbytes)¶ Register a variable as available for remote access.
The operation takes effect after the next call to bsp_sync(). Only one registration is allowed in a single superstep. When a variable is registered, every core must do so.
- Parameters
variable
: A pointer to the local variable
nbytes
: The size in bytes of the variable
The system maintains a stack of registered variables. Variables registered by the different cores in the same superstep are identified with each other. There is a maximum number of registered variables allowed at any given time; the specific number is platform dependent. This limit will be lifted in a future version.
Registering a variable needs to be done before it can be used with the functions bsp_put(), bsp_hpput(), bsp_get(), bsp_hpget().
Usage example:
    int a, b, c, p;
    int x[16];
    bsp_push_reg(&a, sizeof(int));
    bsp_sync();
    bsp_push_reg(&x, sizeof(x));
    bsp_sync();
    p = bsp_pid();
    // Get the value of the `a` variable of core 0 and save it in `b`
    bsp_get(0, &a, 0, &b, sizeof(int));
    // Save the value of `c` into the array `x` on core 0, at array location p
    bsp_put(0, &c, &x, p*sizeof(int), sizeof(int));
- Remark
- In the current implementation, the parameter nbytes is ignored. In future versions it will be used to make communication more efficient.
bsp_pop_reg¶
-
void
bsp_pop_reg
(const void *variable)¶ De-register a variable for remote memory access.
The operation takes effect after the next call to bsp_sync(). The order in which the variables are popped does not matter.
- Parameters
variable
: A pointer to the variable, which must have been previously registered with bsp_push_reg()
bsp_put¶
-
void
bsp_put
(int pid, const void *src, void *dst, int offset, int nbytes)¶ Copy data to another processor (buffered).
The data in src is copied to a buffer (currently in the inefficient external memory) at the moment bsp_put is called. Therefore the caller can replace the data in src right after bsp_put returns. When bsp_sync() is called, the data will be transferred from the buffer to the destination at the other processor.
- Parameters
pid
: The pid of the target processor (this is allowed to be the id of the sending processor)
src
: A pointer to the source data
dst
: A variable location that was previously registered using bsp_push_reg()
offset
: The offset in bytes to be added to the remote location corresponding to the variable location dst
nbytes
: The number of bytes to be copied
- Remark
- No warning is thrown when nbytes exceeds the size of the variable src.
- Remark
- The current implementation uses external memory, which greatly limits the performance of this function. We suggest you use bsp_hpput() wherever possible to ensure good performance.
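The role of the offset parameter can be pictured with a purely local model: the byte offset is added to the start of the registered variable before nbytes are copied. The sketch below is not how EBSP transfers data between cores; put_model is a hypothetical stand-in that only illustrates the address arithmetic, using the same p*sizeof(int) offset as the bsp_push_reg usage example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical local model of the addressing done by a put: copies nbytes
 * from src to (start of the registered variable) + offset. */
static void put_model(void *registered_dst, int offset,
                      const void *src, int nbytes)
{
    assert(offset >= 0 && nbytes >= 0);
    memcpy((char *)registered_dst + offset, src, (size_t)nbytes);
}
```

Writing a single int into slot p of an int array therefore uses an offset of p * sizeof(int) and an nbytes of sizeof(int). As the remarks above note, nothing checks that offset + nbytes stays inside the registered variable.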
bsp_get¶
-
void
bsp_get
(int pid, const void *src, int offset, void *dst, int nbytes)¶ Copy data from another processor (buffered).
No data transaction takes place until the next call to bsp_sync, at which point the data will be copied from source to destination.
- Parameters
pid
: The pid of the target processor (this is allowed to be the id of the sending processor)
src
: A variable that has been previously registered using bsp_push_reg()
dst
: A pointer to a local destination
offset
: The offset in bytes to be added to the remote location corresponding to the variable location src
nbytes
: The number of bytes to be copied
- Remark
- The official BSP standard dictates that all the data of all bsp_get() transactions is first copied into a buffer, after which all the data is written to the proper destinations. This would allow one to use bsp_get() to swap two variables in place. Because of memory constraints we do not comply with the standard. In our implementation the bsp_get() transactions are all executed at the same time, so such a swap would result in undefined behaviour.
- Remark
- No warning is thrown when nbytes exceeds the size of the variable src.
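The difference between the two behaviours can be illustrated with a small model of two cores that each try to fetch the other's value into their own variable. This is only a sketch: the array x stands for the variable on core 0 and core 1, the function names are hypothetical, and the second function shows just one possible ordering of the unbuffered transfers, since the actual outcome is undefined.

```c
#include <assert.h>

/* Model of the standard behaviour: every source value is staged in a buffer
 * first, and only then written to its destination. */
static void swap_with_buffered_gets(int x[2])
{
    assert(x);
    int staged[2];
    staged[0] = x[1];   /* core 0 fetches x from core 1 */
    staged[1] = x[0];   /* core 1 fetches x from core 0 */
    x[0] = staged[0];
    x[1] = staged[1];
}

/* Model of one possible ordering without staging: the transfer for core 0
 * lands before the source for core 1 is read, so core 1 receives a value
 * that has already been overwritten. */
static void swap_with_unstaged_gets(int x[2])
{
    assert(x);
    x[0] = x[1];
    x[1] = x[0];
}
```

Starting from the values 1 and 2, the buffered model ends with 2 and 1, while this particular unstaged ordering ends with 2 and 2. A temporary variable and an extra superstep avoid the problem.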
bsp_hpput¶
-
void
bsp_hpput
(int pid, const void *src, void *dst, int offset, int nbytes)¶ Copy data to another processor, unbuffered.
The data is immediately copied into the destination at the remote processor, as opposed to bsp_put which first copies the data to a buffer. This means the programmer must make sure that the other processor is not using the destination at this moment. The data transfer is guaranteed to be complete after the next call to bsp_sync().
- Parameters
pid
: The pid of the target processor (this is allowed to be the id of the sending processor)
src
: A pointer to local source data
dst
: A variable location that was previously registered using bsp_push_reg()
offset
: The offset in bytes to be added to the remote location corresponding to the variable location dst
nbytes
: The number of bytes to be copied
- Remark
- No warning is thrown when nbytes exceeds the size of the variable src.
bsp_hpget¶
-
void
bsp_hpget
(int pid, const void *src, int offset, void *dst, int nbytes)¶ Copy data from another processor.
This function is the unbuffered version of bsp_get().
As opposed to bsp_get(), the data is transferred immediately when bsp_hpget() is called. When using this function you must make sure that the source data is available and prepared upon calling. For performance reasons, communication using this function should be preferred over buffered communication.
- Parameters
pid
: The pid of the target processor (this is allowed to be the id of the sending processor)
src
: A variable that has been previously registered using bsp_push_reg()
dst
: A pointer to a local destination
offset
: The offset in bytes to be added to the remote location corresponding to the variable location src
nbytes
: The number of bytes to be copied
- Remark
- No warning is thrown when nbytes exceeds the size of the variable src.
bsp_set_tagsize¶
Set the tag size.
Upon return, the value pointed to by tag_bytes will contain the old tag size. The new tag size will take effect in the next superstep, so that messages sent in this superstep will have the old tag size.
- Parameters
tag_bytes
: A pointer to the tag size, in bytes
ebsp_get_tagsize¶
Obtain the tag size.
This function gets the tag size currently in use. This tagsize remains valid until the start of the next superstep.
- Return
- The tag size in bytes
bsp_send¶
-
void
bsp_send
(int pid, const void *tag, const void *payload, int nbytes)¶ Send a message to another processor.
This will send a message to the target processor, using the message passing system. The tag size can be obtained by ebsp_get_tagsize. When this function returns, the data has been copied so the user can use the buffer for other purposes.
- Parameters
pid
: The pid of the target processor (this is allowed to be the id of the sending processor)tag
: A pointer to the tag datapayload
: A pointer to the data payloadnbytes
: The size of the data payload
bsp_qsize¶
-
void
bsp_qsize
(int *packets, int *accum_bytes)¶ Obtain the number of messages in the queue and the combined size in bytes of their data.
Upon return, the integers pointed to by packets and accum_bytes will hold the number of messages in the queue, and the sum of the sizes of their data payloads respectively.
- Parameters
packets
: A pointer to an integer which will be overwritten with the number of messages
accum_bytes
: A pointer to an integer which will be overwritten with the combined number of bytes of the message data.
bsp_get_tag¶
-
void
bsp_get_tag
(int *status, void *tag)¶ Obtain the tag and size of the next message without popping the message.
Upon return, the integer pointed to by status will receive the size of the data payload in bytes of the next message in the queue. If there is no next message it will be set to -1. The buffer pointed to by tag should be large enough to store the tag. The minimum size can be obtained by calling ebsp_get_tagsize.
- Parameters
status
: A pointer to an integer receiving the message data size in bytes.
tag
: A pointer to a buffer receiving the message tag
bsp_move¶
-
void
bsp_move
(void *payload, int buffer_size)¶ Obtain the next message from the message queue and pop the message.
This will copy the payload and pop the message from the queue. The size of the payload can be obtained by calling bsp_get_tag(). If buffer_size is smaller than the data payload then the data is truncated.
- Parameters
payload
: A pointer to a buffer receiving the data payload
buffer_size
: The size of the buffer
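The truncation rule can be modelled as a copy of at most buffer_size bytes. The sketch below is an illustration of that rule only: copy_truncated is a hypothetical helper rather than the library's implementation, and payload_size stands for the value that bsp_get_tag() reports through its status argument.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the documented truncation: copies
 * min(buffer_size, payload_size) bytes and returns the number copied. */
static int copy_truncated(void *buffer, int buffer_size,
                          const void *payload, int payload_size)
{
    assert(buffer_size >= 0 && payload_size >= 0);
    int n = payload_size < buffer_size ? payload_size : buffer_size;
    if (n > 0)
        memcpy(buffer, payload, (size_t)n);
    return n;
}
```

Sizing the buffer from the payload size reported by bsp_get_tag() before moving the message avoids losing data to truncation.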
bsp_hpmove¶
-
int
bsp_hpmove
(void **tag_ptr_buf, void **payload_ptr_buf)¶ Obtain the next message, with tag, from the queue and pop the message.
This function will give the user direct pointers to the tag and data of the message. This avoids the data copy as done in bsp_move().
- Return
- The number of bytes of the payload data
- Parameters
tag_ptr_buf
: A pointer to a pointer receiving the location of the tag
payload_ptr_buf
: A pointer to a pointer receiving the location of the data payload
- Remark
- Note that both tag and payload can be stored in external memory. Repeatedly accessing them there leads to worse overall performance, to the point where bsp_move() can actually outperform this variant.
bsp_stream_open¶
-
int
bsp_stream_open
(ebsp_stream *stream, int stream_id)¶ Open a stream that was created using bsp_stream_create on the host.
The first stream created by the host will have stream_id 0.
- Return
- Nonzero if successful.
- Parameters
stream
: Pointer to an existing ebsp_stream struct to hold the stream data. This struct can be allocated on the stack by the user.
stream_id
: The index of the stream.
Usage example:
    ebsp_stream mystream;
    if (bsp_stream_open(&mystream, 3)) {
        // Get some data
        void* buffer = 0;
        bsp_stream_move_down(&mystream, &buffer, 0);
        // The data is now in buffer
        // Finally, close the stream
        bsp_stream_close(&mystream);
    }
- Remark
- This function has to be called before performing any other operation on the stream.
- Remark
- Each call to this function should be matched by a single call to bsp_stream_close.
bsp_stream_close¶
-
void
bsp_stream_close
(ebsp_stream *stream)¶ Wait for pending transfers to complete and close a stream.
Behaviour is undefined if stream is not a handle opened by bsp_stream_open.
- Parameters
stream
: The handle of the stream, opened by bsp_stream_open.
Cleans up the stream, and frees any buffers that may have been used by the stream.
bsp_stream_move_up¶
-
int
bsp_stream_move_up
(ebsp_stream *stream, const void *data, int data_size, int wait_for_completion)¶ Write a local token up to a stream.
The function always waits for the transfer of the previous token to finish.
- Return
- Number of bytes written. Zero if an error has occurred.
- Parameters
stream
: The handle of the stream
data
: The data to be sent up the stream
data_size
: The size of the data to be sent, i.e. the size of the token. Behaviour is undefined if it is not a multiple of 8. If it is not a multiple of 8 bytes then transfers will be slow.
wait_for_completion
: If nonzero this function blocks until the data is completely written to the stream.
If wait_for_completion is nonzero, this function will wait until the data is transferred. This corresponds to single buffering.
Alternatively, double buffering can be used as follows. Set wait_for_completion to zero and continue constructing the next token in a different buffer.
Usage example:
    int* buf1 = ebsp_malloc(100 * sizeof(int));
    int* buf2 = ebsp_malloc(100 * sizeof(int));
    int* curbuf = buf1;
    int* otherbuf = buf2;
    ebsp_stream s;
    bsp_stream_open(&s, 0); // open stream 0
    while (...) {
        // Fill curbuf
        for (int i = 0; i < 100; i++)
            curbuf[i] = 5;
        // Send up without waiting for completion
        bsp_stream_move_up(&s, curbuf, 100 * sizeof(int), 0);
        // Use the other buffer for the next token
        int* tmp = curbuf;
        curbuf = otherbuf;
        otherbuf = tmp;
    }
    // Wait for pending transfers before releasing the buffers
    bsp_stream_close(&s);
    ebsp_free(buf1);
    ebsp_free(buf2);
- Remark
- Behaviour is undefined if the stream was not opened using bsp_stream_open.
- Remark
- Memory is transferred using the DMA1 engine.
bsp_stream_move_down¶
-
int
bsp_stream_move_down
(ebsp_stream *stream, void **buffer, int preload)¶ Obtain the next token from a stream.
When calling this function, the token that was obtained at the previous call will be overwritten.
- Return
- Number of bytes of the obtained chunk. If the stream has finished or an error has occurred this function will return 0.
- Parameters
stream
: The handle of the stream
buffer
: Receives a pointer to a local copy of the next token.
preload
: If this parameter is nonzero then the BSP system will preload the next token asynchronously (double buffering).
- Remark
- Behaviour is undefined if the stream was not opened using bsp_stream_open.
- Remark
- Memory is transferred using the DMA1 engine.
- Remark
- When using double buffering, the BSP system will allocate memory for the next chunk, and will start writing to it using the DMA engine while the current chunk is processed. This requires more (local) memory, but can greatly increase the overall speed.
bsp_stream_seek¶
-
void
bsp_stream_seek
(ebsp_stream *stream, int delta_tokens)¶ Move the cursor in the stream, to change the next token to be obtained.
If delta_tokens is out of bounds, then the cursor will be moved to the start or end of the stream respectively.
- bsp_stream_seek(stream, INT_MIN) will set the cursor to the start of the stream
- bsp_stream_seek(stream, INT_MAX) will set the cursor to the end of the stream
- Parameters
stream
: The handle of the stream
delta_tokens
: The number of tokens to skip if delta_tokens > 0, or to go back if delta_tokens < 0.
Note that if bsp_stream_move_down is used with preload enabled (meaning the last call to that function had preload enabled), then calling bsp_stream_seek will discard any token that was preloaded in memory, so the first call to bsp_stream_move_down after this will yield a token from the new position.
- Remark
- This function provides a mechanism through which chunks can be obtained multiple times. It gives you random access to the data in the stream.
- Remark
- This function has O(delta_tokens) complexity.
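The clamping of out-of-bounds seeks can be sketched as follows. This is a model of the documented behaviour, not the library's code: it assumes the cursor is the index of the next token in a stream of ntokens tokens, and seek_clamped is a hypothetical name. The sum is computed in a wider type so that INT_MIN and INT_MAX do not overflow.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model: returns the new cursor after moving by delta_tokens,
 * clamped to the start (0) or the end (ntokens) of the stream. */
static int seek_clamped(int cursor, int delta_tokens, int ntokens)
{
    assert(ntokens >= 0 && cursor >= 0 && cursor <= ntokens);
    long long target = (long long)cursor + (long long)delta_tokens;
    if (target < 0)
        return 0;            /* moved past the start of the stream */
    if (target > ntokens)
        return ntokens;      /* moved past the end of the stream */
    return (int)target;
}
```

In this model a seek by INT_MIN always lands on the start and a seek by INT_MAX always lands on the end, matching the two cases listed above.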