API

struct augment_args

Public Members

int w
int h
float scale
float rad
float dx
float dy
float aspect
class BOX

Inherits from Structure

Private Static Attributes

list ObjectRecognition.BOX._fields_ = [("x", c_float), ("y", c_float), ("w", c_float), ("h", c_float)]
struct box

Public Members

float x
float y
float w
float h
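The (x, y, w, h) fields store a box center and its extent, as darknet does. A common operation on this representation is intersection-over-union; the following self-contained sketch (the helper names are ours, not darknet's exact API) shows how it is computed from the center/extent layout:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Illustrative stand-in for darknet's box: (center x, center y, width, height).
struct box { float x, y, w, h; };

// Overlap of two 1-D segments given their centers and extents (negative if disjoint).
static float overlap(float x1, float w1, float x2, float w2) {
    float l = std::max(x1 - w1 / 2, x2 - w2 / 2);
    float r = std::min(x1 + w1 / 2, x2 + w2 / 2);
    return r - l;
}

// Intersection-over-union of two boxes; 0 when they do not overlap.
float box_iou(box a, box b) {
    float w = overlap(a.x, a.w, b.x, b.w);
    float h = overlap(a.y, a.h, b.y, b.h);
    if (w <= 0 || h <= 0) return 0;
    float inter = w * h;
    float uni = a.w * a.h + b.w * b.h - inter;
    return inter / uni;
}
```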
class BOX

Inherits from Structure

Private Static Attributes

list FaceDetect.BOX._fields_ = [("x", c_float), ("y", c_float), ("w", c_float), ("h", c_float)]
struct box_label

Public Members

int id
float x
float y
float w
float h
float left
float right
float top
float bottom
struct CalibrationParameters
#include <Core.hpp>

Intrinsic parameters of each camera and extrinsic parameters (translation and rotation) between the two.

Note
The calibration/rectification process, called during sl::Camera::open, uses the raw parameters defined in the SNXXX.conf file, where XXX is the ZED serial number. The Self-Calibration may or may not adjust these values to get a proper image alignment. Once sl::Camera::open succeeds (with or without Self-Calibration activated), most of the stereo parameters (except the baseline, of course) should be 0 or very close to 0. This means that images after the rectification process (given by retrieveImage()) are aligned as if they were taken by a “perfect” stereo camera, defined by the new CalibrationParameters.

Public Members

sl::float3 R

Rotation (using Rodrigues’ transformation) between the two sensors. Defined as ‘tilt’, ‘convergence’ and ‘roll’.

sl::float3 T

Translation between the two sensors. T.x is the distance between the two cameras (baseline) in the sl::UNIT chosen during sl::Camera::open (mm, cm, meters, inches...).

CameraParameters left_cam

Intrinsic parameters of the left camera

CameraParameters right_cam

Intrinsic parameters of the right camera
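With the rectified parameters above, depth can be recovered from stereo disparity through the standard pinhole relation Z = fx · baseline / disparity, where the baseline is T.x. A self-contained sketch with illustrative numbers (not values from a real SNXXX.conf file):

```cpp
#include <cassert>

// Standard pinhole-stereo relation: depth Z = fx * baseline / disparity.
// fx comes from CameraParameters::fx (pixels); the baseline from
// CalibrationParameters::T.x (in the sl::UNIT chosen at open()).
float depth_from_disparity(float fx, float baseline, float disparity) {
    if (disparity <= 0.f) return -1.f;   // no valid stereo match
    return fx * baseline / disparity;
}
```

For example, with fx = 700 px and a 120 mm baseline, a 70 px disparity corresponds to a depth of 1200 mm.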

class Camera
#include <Camera.hpp>

The main class to use the ZED camera.

Camera infos

void setSVOPosition(int frame_number)

Sets the position of the SVO file to a desired frame.

Note
Works only if the camera is open in SVO playback mode.
Parameters
  • frame_number: : the number of the desired frame to be decoded.

int getSVOPosition()

Returns the current position of the SVO file.

Return
The current position in the SVO file as int (-1 if the SDK is not reading a SVO).
Note
Works only if the camera is open in SVO reading mode.

int getSVONumberOfFrames()

Returns the number of frames in the SVO file.

Return
The total number of frames in the SVO file (-1 if the SDK is not reading a SVO).
Note
Works only if the camera is open in SVO reading mode.

void setCameraSettings(CAMERA_SETTINGS settings, int value, bool use_default = false)

Sets the value of the corresponding sl::CAMERA_SETTINGS (gain, brightness, hue, exposure...).

Note
Works only if the camera is open in live mode.
Parameters
  • settings: : enum for the control mode.
  • value: : value to set for the corresponding control.
  • use_default: : will set default (or automatic) value if set to true (value (int) will not be taken into account).

int getCameraSettings(CAMERA_SETTINGS setting)

Returns the current value of the corresponding sl::CAMERA_SETTINGS (gain, brightness, hue, exposure...).

Return
The current value for the corresponding control (-1 if something wrong happened).
Note
Works only if the camera is open in live mode.
Parameters
  • setting: : enum for the control mode.

float getCameraFPS()

Returns the current FPS of the camera.

Return
The current FPS (or the recorded FPS for SVO). Returns -1.f if something goes wrong.

void setCameraFPS(int desired_fps)

Sets a new frame rate for the camera, or the closest available frame rate.

Note
Works only if the camera is open in live mode.
Parameters
  • desired_fps: : the new desired frame rate.

float getCurrentFPS()

Returns the current FPS of the application/callback. It is based on the difference of camera timestamps between two successful grab() calls.

Return
The current FPS of the application (if grab leads the application) or callback (if ZED is called in a thread)

unsigned long long getCameraTimestamp()

Returns the timestamp at the time the frame was extracted from the USB stream (should be called after a grab()).

Return
The timestamp of the frame grab in ns. -1 if not available (SVO file without compression).
Note
SVO file from SDK 1.0.0 (with compression) contains the camera timestamp for each frame.

unsigned long long getCurrentTimestamp()

Returns the current timestamp at the time the function is called. It can be compared to sl::Camera::getCameraTimestamp for synchronization, since both timestamps share the same reference (computer start time).

Return
The current timestamp in ns.

unsigned int getFrameDroppedCount()

Returns the number of frames dropped since sl::Camera::grab was first called. Based on camera timestamp and FPS comparison.

Return
The number of frames dropped since the first sl::Camera::grab call.

CameraInformation getCameraInformation()

Returns camera information (calibration parameters, serial number and current firmware version).

Return
A CameraInformation structure containing the calibration parameters of the ZED, as well as the ZED serial number (as uint) (live or SVO) and the ZED firmware version (as uint); the latter two are 0 if the ZED is not connected.

Self calibration

SELF_CALIBRATION_STATE getSelfCalibrationState()

Returns the current status of the self-calibration.

Return
A status code giving information about the self-calibration status. For more details see sl::SELF_CALIBRATION_STATE.

void resetSelfCalibration()

Resets the camera self-calibration. This function can be called at any time AFTER the sl::Camera::open function has been called. It resets and recalculates the correction for misalignment, convergence and color mismatch. It can be called after changing camera parameters without needing to restart your executable.

If no problem was encountered, the camera will use the new parameters. Otherwise, it will keep the old ones.

Tracking

ERROR_CODE enableTracking(TrackingParameters tracking_params = TrackingParameters ())

Initializes and starts the tracking process.

Return
sl::ERROR_CODE_FAILURE if the sl::TrackingParameters::area_file_path file wasn’t found, sl::SUCCESS otherwise.
Warning
The area localization is a beta feature, the behavior might change in the future.
Parameters
  • tracking_params: : structure of sl::TrackingParameters, which defines the specific tracking parameters. default : leave it empty to get the best default parameters, or create your own structure to change them according to the sl::TrackingParameters documentation.

sl::TRACKING_STATE getPosition(sl::Pose &camera_pose, REFERENCE_FRAME reference_frame = sl::REFERENCE_FRAME_WORLD)

Fills the position of the camera frame in the world frame and returns the current state of the tracker.

Extract the rotation matrix: camera_pose.getRotation(); the translation vector: camera_pose.getTranslation(); the orientation/quaternion: camera_pose.getOrientation();

Note
The camera frame is positioned at the back of the left eye of the ZED.
Return
The current state of the tracking process.
Parameters
  • camera_pose: (out) : the pose containing the position of the camera (path or position) and other information (timestamp, confidence)
  • reference_frame: : defines the reference from which you want the pose to be expressed.
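Put together, a minimal tracking loop looks like the following sketch (untested here; it assumes an already-opened sl::Camera named `zed` with enableTracking() already called, and the TRACKING_STATE_OK value from sl::TRACKING_STATE):

```cpp
// Sketch only: `zed` is an opened sl::Camera with tracking enabled.
sl::Pose camera_pose;
while (zed.grab() == sl::SUCCESS) {
    sl::TRACKING_STATE state = zed.getPosition(camera_pose, sl::REFERENCE_FRAME_WORLD);
    if (state == sl::TRACKING_STATE_OK) {
        sl::Translation t = camera_pose.getTranslation();   // camera position in the world frame
        sl::Orientation q = camera_pose.getOrientation();   // orientation as a quaternion
    }
}
```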

sl::AREA_EXPORT_STATE getAreaExportState()

Returns the state of exportation of the area database (spatial memory).

Return
The current state of the exportation of the area file.

void disableTracking(sl::String area_file_path = "")

Disables motion tracking.

Warning
This feature is still in beta, you might encounter reloading issues. Please also note that the ‘.area’ database depends on the depth map sl::SENSING_MODE chosen during the recording. The same mode must be used to reload the database.
Note
The saving is done asynchronously; the state can be obtained with getAreaExportState().
Parameters
  • area_file_path: (optional) : if set, saves the spatial database in a ‘.area’ file. area_file_path is the name and path of the database, e.g. : “path/to/file/myArea1.area”.

void resetTracking(sl::Transform &path)

Resets the tracking and re-initializes the path with the given transformation matrix.

Note
Please note that this function will also flush the area database that was built or loaded.

Spatial Mapping

ERROR_CODE enableSpatialMapping(SpatialMappingParameters spatial_mapping_parameters = SpatialMappingParameters ())

Initializes and starts the spatial mapping processes. The spatial mapping will create a geometric representation of the scene based on both tracking data and 3D point clouds. The resulting output is a sl::Mesh and can be obtained by the sl::Camera::extractWholeMesh function or with sl::Camera::retrieveMeshAsync after calling sl::Camera::requestMeshAsync.

Return
sl::SUCCESS if everything went fine, sl::ERROR_CODE_FAILURE otherwise
Warning
The tracking needs to be enabled to create a map
Warning
The performance greatly depends on the input parameters. If the mapping framerate is too slow in live mode, consider using an SVO file or choosing a coarser mesh resolution.
Note
This feature uses host memory (RAM) to store the 3D map; the maximum amount of memory allowed can be tweaked using the SpatialMappingParameters.
Parameters
  • spatial_mapping_parameters: : the structure containing all the specific parameters for the spatial mapping. default : leave it empty to get the best default parameters, or initialize it from a preset. For more information, check the sl::SpatialMappingParameters documentation.

void pauseSpatialMapping(bool status)

Switches the pause status of the data integration mechanism for the spatial mapping.

Parameters
  • status: : if true, the integration is paused. If false, the spatial mapping is resumed.

SPATIAL_MAPPING_STATE getSpatialMappingState()

Returns the current spatial mapping state.

Return
status The current state of the spatial mapping process

ERROR_CODE extractWholeMesh(sl::Mesh &mesh)

Extracts the current mesh from the spatial mapping process.

Note
This function will return when the mesh has been created or updated; it is therefore a blocking function. You should either call it in a thread or at the end of the mapping process. Calling this function in the grab loop will block the depth and tracking computation and therefore give bad results.
Return
sl::SUCCESS if the mesh is filled and available, otherwise sl::ERROR_CODE_FAILURE.
Parameters
  • mesh: (out) : The mesh to be filled.

void requestMeshAsync()

Starts the mesh generation process in a non-blocking thread from the spatial mapping process.

Note
Only one mesh generation can run at a time; while the previous one has not finished, every new call is ignored.

ERROR_CODE getMeshRequestStatusAsync()

Returns the mesh generation status. Useful after calling requestMeshAsync.

Return
sl::SUCCESS if the mesh is ready and not yet retrieved, otherwise sl::ERROR_CODE_FAILURE.

ERROR_CODE retrieveMeshAsync(sl::Mesh &mesh)

Retrieves the generated mesh after calling requestMeshAsync.

Return
sl::SUCCESS if the mesh is retrieved, otherwise sl::ERROR_CODE_FAILURE.
Parameters
  • mesh: (out) : The mesh to be filled.
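The three asynchronous functions are meant to be used together. A hedged sketch of the polling pattern (untested here; it assumes an opened sl::Camera named `zed` with tracking and spatial mapping enabled):

```cpp
// Sketch only: `zed` has tracking and spatial mapping enabled.
sl::Mesh mesh;
zed.requestMeshAsync();                          // launch generation in a background thread
while (zed.grab() == sl::SUCCESS) {
    // ... normal grab work continues while the mesh is built ...
    if (zed.getMeshRequestStatusAsync() == sl::SUCCESS) {
        zed.retrieveMeshAsync(mesh);             // copy out the generated mesh
        break;
    }
}
```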

void disableSpatialMapping()

Disables the spatial mapping process. All the spatial mapping functions are disabled; the mesh cannot be retrieved after this call.

Recorder

ERROR_CODE enableRecording(sl::String video_filename, SVO_COMPRESSION_MODE compression_mode = SVO_COMPRESSION_MODE_LOSSLESS)

Creates a file for recording the current frames.

Warning
This function can be called multiple times during the ZED lifetime, but if video_filename already exists, the file will be erased.
Return
An sl::ERROR_CODE that defines whether the file was successfully created and can be filled with images: sl::SUCCESS if the file can be filled, sl::ERROR_CODE_SVO_RECORDING_ERROR if something went wrong.
Parameters
  • video_filename: : can be a *.svo file or a *.avi file (detected by the suffix name provided).
  • compression_mode: : can be one of the sl::SVO_COMPRESSION_MODE enum.

sl::RecordingState record()

Records the current frame provided by grab() into the file.

Warning
grab() must be called before record() to capture the latest available frame; otherwise, the previously grabbed frame will be recorded.
Return
The recording state structure, for more details see sl::RecordingState.

void disableRecording()

Disables the recording and closes the generated file.

Public Functions

Camera()

Default constructor which creates an empty Camera.

~Camera()

Camera destructor.

void close()

Closes the camera and frees the memory. Camera::open can then be called again to reset the camera if needed.

ERROR_CODE open(InitParameters init_parameters = InitParameters ())

Opens the ZED camera in the desired mode (live/SVO), sets all the defined parameters, checks hardware requirements and launches the internal self-calibration.

Return
An error code giving information about the internal process. If SUCCESS is returned, the camera is ready to use; every other code indicates an error and the program should be stopped. For more details see sl::ERROR_CODE.
Parameters
  • init_parameters: : a structure containing all the individual parameters

bool isOpened()

Tests if the camera is opened and running.

Return
true if the ZED is already setup, otherwise false.

ERROR_CODE grab(RuntimeParameters rt_parameters = RuntimeParameters ())

Grabs a new image, rectifies it and computes the disparity map and optionally the depth map. The grabbing function is typically called in the main loop.

Return
sl::SUCCESS if no problem was encountered, sl::ERROR_CODE_NOT_A_NEW_FRAME otherwise.
Parameters
  • rt_parameters: : a structure containing all the individual parameters.

ERROR_CODE retrieveImage(Mat &mat, VIEW view = VIEW_LEFT, MEM type = MEM_CPU)

Downloads the rectified image from the device and returns the CPU buffer. The retrieve function should be called after the function Camera::grab.

Return
SUCCESS if the method succeeded, ERROR_CODE_FAILURE if an error occurred.
Parameters
  • mat: : the Mat to store the image.
  • view: : defines the image side wanted (see sl::VIEW)
  • type: : the memory type desired. sl::MEM_CPU by default.
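A minimal open / grab / retrieve loop, as a sketch only (untested here; it assumes a connected ZED camera and the SDK headers):

```cpp
#include <sl/Camera.hpp>

int main() {
    sl::Camera zed;
    sl::InitParameters init;                       // defaults: HD720, DEPTH_MODE_PERFORMANCE
    if (zed.open(init) != sl::SUCCESS) return 1;   // any other code means the program should stop

    sl::Mat image;
    for (int i = 0; i < 100; ++i) {
        if (zed.grab() == sl::SUCCESS) {           // rectifies images and computes the depth map
            zed.retrieveImage(image, sl::VIEW_LEFT, sl::MEM_CPU);
            // `image` now holds the rectified left view in CPU memory
        }
    }
    zed.close();
    return 0;
}
```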

ERROR_CODE retrieveMeasure(Mat &mat, MEASURE measure = MEASURE_DEPTH, MEM type = MEM_CPU)

Downloads the measure (disparity, depth or confidence of disparity) from the device and returns the CPU buffer. The retrieve function should be called after the function Camera::grab.

Return
SUCCESS if the method succeeded, ERROR_CODE_FAILURE if an error occurred.
Parameters
  • mat: : the Mat to store the measures.
  • measure: : defines the type wanted, such as disparity map, depth map or the confidence (see sl::MEASURE)
  • type: : the memory type desired. sl::MEM_CPU by default.

void setConfidenceThreshold(int conf_threshold_value)

Sets a threshold for the disparity map confidence (and by extension the depth map). The function should be called before Camera::grab to be taken into account.

Parameters
  • conf_threshold_value: : a value in [1,100]. A lower value means more confidence and precision (but less density); a higher value reduces the filtering (more density, less certainty). Any other value disables the filtering.

int getConfidenceThreshold()

Returns the current confidence threshold value applied to the disparity map (and by extension the depth map).

Return
The current threshold value between 0 and 100.

CUcontext getCUDAContext()

Returns the CUDA context used for all the computation.

Return
The CUDA context created by the inner process.

Resolution getResolution()

Returns the current image size.

Return
The image resolution.

void setDepthMaxRangeValue(float depth_max_range)

Sets the maximum distance for depth/disparity estimation (all values beyond this limit will be reported as TOO_FAR).

Parameters
  • depth_max_range: : maximum distance in the defined sl::UNIT.

float getDepthMaxRangeValue()

Returns the current maximum distance of depth/disparity estimation.

Return
The current maximum distance that can be computed in the defined sl::UNIT.

float getDepthMinRangeValue()

Returns the closest measurable distance by the camera, according to the camera and the depth map parameters.

Return
The minimum distance that can be computed in the defined sl::UNIT.

Public Static Functions

static sl::String getSDKVersion()

Returns the version of the currently installed ZED SDK.

Return
The ZED SDK version as a string with the following format : MAJOR.MINOR.PATCH

static int isZEDconnected()

Checks if ZED cameras are connected, can be called before instantiating a Camera object.

Return
The number of connected ZED cameras.
Warning
On Windows, only one ZED is accessible, so this function will return 1 even if multiple ZED cameras are connected.

static sl::ERROR_CODE sticktoCPUCore(int cpu_core)

Sticks the calling thread to a specific CPU core. This function is only available for Jetson TK1 and TX1.

Return
sl::SUCCESS if the thread was successfully stuck to the core, otherwise an error status.
Warning
Function only available on Jetson. On other platforms, the result will always be 0 and no operation is performed.
Parameters
  • cpu_core: : int that defines the core on which the thread must run; can be between 0 and 3 (cpu0, cpu1, cpu2, cpu3).

Private Functions

ERROR_CODE openCamera(bool)
bool nextImage(bool)
int initMemory()
bool initRectifier()

Private Members

CameraMemberHandler *h = 0
bool opened = false
struct CameraInformation
#include <Core.hpp>

Camera specific parameters.

Public Members

CalibrationParameters calibration_parameters

Intrinsic and Extrinsic stereo parameters for rectified images (default).

CalibrationParameters calibration_parameters_raw

Intrinsic and Extrinsic stereo parameters for original images (unrectified).

unsigned int serial_number = 0

Camera-dependent serial number.

unsigned int firmware_version = 0

Current firmware version of the camera.

struct CameraParameters
#include <Core.hpp>

Intrinsic parameters of a camera.

Note
Similar to the CalibrationParameters, these parameters are taken from the settings file (SNXXX.conf) and may be modified during the sl::Camera::open call (with or without Self-Calibration). After the sl::Camera::open call, these parameters represent the “new camera matrix” that fits/defines each image taken after rectification (through retrieveImage).
Note
fx, fy, cx and cy must be identical for the left and right cameras once sl::Camera::open has been called. Since distortion is corrected during rectification, it should not be considered after the sl::Camera::open call.

Public Functions

void SetUp(float focal_x, float focal_y, float center_x, float center_y)

Sets up the parameters of a camera.

Parameters
  • focal_x: : horizontal focal length.
  • focal_y: : vertical focal length.
  • center_x: : horizontal optical center.
  • center_y: : vertical optical center.

Public Members

float fx

Focal length in pixels along x axis.

float fy

Focal length in pixels along y axis.

float cx

Optical center along x axis, defined in pixels (usually close to width/2).

float cy

Optical center along y axis, defined in pixels (usually close to height/2).

double disto[5]

Distortion factors: [k1, k2, p1, p2, k3]. Radial (k1, k2, k3) and tangential (p1, p2) distortion.

float v_fov

Vertical field of view after stereo rectification, in degrees.

float h_fov

Horizontal field of view after stereo rectification, in degrees.

float d_fov

Diagonal field of view after stereo rectification, in degrees.

Resolution image_size

Size in pixels of the images given by the camera.
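The intrinsics above follow the standard pinhole model: a 3D point (X, Y, Z) in the camera frame projects to pixel (fx·X/Z + cx, fy·Y/Z + cy), and the fields of view relate to the focal lengths by, e.g., h_fov = 2·atan(width / (2·fx)). A self-contained sketch with made-up intrinsics (real ones come from getCameraInformation()):

```cpp
#include <cassert>
#include <cmath>

// Illustrative stand-in for the fx/fy/cx/cy members above.
struct Intrinsics { float fx, fy, cx, cy; };

// Project a 3D point (camera frame, Z forward) to pixel coordinates
// with the pinhole model the rectified CameraParameters describe.
void project(const Intrinsics &K, float X, float Y, float Z, float &u, float &v) {
    u = K.fx * X / Z + K.cx;
    v = K.fy * Y / Z + K.cy;
}

// Horizontal field of view in degrees: h_fov = 2 * atan(width / (2 * fx)).
float horizontal_fov_deg(float fx, float width) {
    return 2.f * std::atan(width / (2.f * fx)) * 180.f / 3.14159265358979f;
}
```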

struct data

Public Members

int w
int h
matrix X
matrix y
int shallow
int *num_boxes
box **boxes
struct dbox

Public Members

float dx
float dy
float dw
float dh
struct detection_info

Public Members

char name[32]
int left
int right
int top
int bottom
float prob
struct float_pair

Public Members

float *x
float *y
class IMAGE

Inherits from Structure

Private Static Attributes

list FaceDetect.IMAGE._fields_ = [("w", c_int), ("h", c_int), ("c", c_int), ("data", POINTER(c_float))]
class IMAGE

Inherits from Structure

Private Static Attributes

list ObjectRecognition.IMAGE._fields_ = [("w", c_int), ("h", c_int), ("c", c_int), ("data", POINTER(c_float))]
class IMAGE

Inherits from Structure

Private Static Attributes

list darknet.IMAGE._fields_ = [("w", c_int), ("h", c_int), ("c", c_int), ("data", POINTER(c_float))]
struct image

Public Members

int w
int h
int c
float *data
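The struct itself does not document the buffer layout, but darknet stores the data channel-planar (all of channel 0 first, then channel 1, ...), row-major within each channel, as its image.c accessors do: data[c*h*w + y*w + x]. A self-contained sketch of the indexing (helper names are illustrative):

```cpp
#include <cassert>
#include <vector>

// Illustrative stand-in for darknet's image struct.
struct image { int w, h, c; float *data; };

// Read a pixel value using darknet's channel-planar, row-major layout.
float get_pixel(const image &m, int x, int y, int c) {
    return m.data[c * m.h * m.w + y * m.w + x];
}

// Write a pixel value at the same location.
void set_pixel(image &m, int x, int y, int c, float v) {
    m.data[c * m.h * m.w + y * m.w + x] = v;
}
```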
class ImageClass

Public Functions

__init__(self, name, image_paths)
__str__(self)
__len__(self)

Public Members

name
image_paths
struct Indice
#include <Mesh.hpp>

Stores the indices per face of the associated vertices/normals/texture coordinates.

Note
Contains data only after you call Mesh::applyTexture().

Public Members

sl::uint3 v_vn_ind

Vertex and normal indices.

sl::uint3 uv_ind

Texture coordinate indices.

struct InfoOption

Public Members

std::string svo_path
bool recordingMode = false
std::string output_path
bool computeDisparity = false
bool videoMode = false
class InitParameters
#include <Camera.hpp>

Parameters for ZED initialization.

A default constructor is enabled.

Public Functions

InitParameters(RESOLUTION camera_resolution_ = RESOLUTION_HD720, int camera_fps_ = 0, int camera_linux_id_ = 0, sl::String svo_input_filename_ = sl::String(), bool svo_real_time_mode_ = false, DEPTH_MODE depth_mode_ = DEPTH_MODE_PERFORMANCE, UNIT coordinate_units_ = UNIT_MILLIMETER, COORDINATE_SYSTEM coordinate_system_ = COORDINATE_SYSTEM_IMAGE, bool sdk_verbose_ = false, int sdk_gpu_id_ = -1, float depth_minimum_distance_ = -1., bool camera_disable_self_calib_ = false, bool camera_image_flip_ = false, int camera_buffer_count_linux_ = 4)

Default constructor; sets all parameters to their default and optimized values.

bool save(sl::String filename)

Saves the current set of parameters into a file.

Return
true if file was successfully saved, otherwise false.
Parameters
  • filename: : the path to the file in which the parameters will be stored.

bool load(sl::String filename)

Loads the values of the parameters contained in a file.

Return
true if the file was successfully loaded, otherwise false.
Parameters
  • filename: : the path to the file from which the parameters will be loaded.

Public Members

RESOLUTION camera_resolution

Defines the chosen ZED resolution. default : RESOLUTION_HD720.

int camera_fps

Requested FPS for this resolution.

Setting it to 0 will choose the default FPS for this resolution (see the User guide). default : 0

int camera_linux_id

ONLY for Linux: if multiple ZEDs are connected, it will choose the first ZED listed (if camera_linux_id=0), the second one (if camera_linux_id=1), ...

Each ZED will create its own memory (CPU and GPU), therefore the number of ZEDs available depends on the configuration of your computer. Currently not available for Windows. default : 0

sl::String svo_input_filename

Path with filename to the recorded SVO file.

bool svo_real_time_mode

When enabled, the timestamp is taken as the reference to determine the reading framerate.

This mode simulates the live camera and consequently skips frames if the computation framerate is too slow. default : false

UNIT coordinate_units

Defines the unit for all metric values (depth, point cloud, tracking).

default : sl::UNIT::UNIT_MILLIMETER

COORDINATE_SYSTEM coordinate_system

Defines the coordinate system of the world frame (and the camera frame as well).

This defines the order of the axes of the coordinate system. See COORDINATE_SYSTEM for more information. default : COORDINATE_SYSTEM::COORDINATE_SYSTEM_IMAGE

DEPTH_MODE depth_mode

Defines the quality of the depth map; affects the level of detail and the computation time.

default : DEPTH_MODE::DEPTH_MODE_PERFORMANCE

float depth_minimum_distance

Specifies the minimum depth that will be computed, in the sl::UNIT you previously defined.

default : 70cm (-1)

Warning
The computation time is affected by the value, exponentially: the closer the minimum distance gets, the longer the disparity search takes. In case of limited computation power, consider increasing the value.
Note
This value is used to calculate the disparity range estimation (in pixels). Due to the metric-to-pixel conversion, a small difference may occur with the value returned by getDepthMinRangeValue().

int camera_image_flip

Defines whether the images are horizontally flipped.

default : 0

bool camera_disable_self_calib

If set to true, self-calibration will be disabled and the optional calibration parameters will be used without optimization.

It is advised to leave it as false, so that the calibration parameters can be optimized. default : false

int camera_buffer_count_linux

Sets the number of buffers in the internal grabbing process.

Decreasing this number may reduce latency but can also produce more corrupted frames. default : 4

Warning
Linux only; this parameter has no effect on Windows.

bool sdk_verbose

If set to true, it will output some information about the current status of initialization.

default : false

int sdk_gpu_id

Defines the graphics card on which the computation will be done.

The default value searches for the most powerful (most CUDA cores) usable GPU. default : -1

struct kvp

Public Members

char *key
char *val
int used
struct layer

Public Members

LAYER_TYPE type
ACTIVATION activation
COST_TYPE cost_type
void (*forward)(struct layer, struct network)
void (*backward)(struct layer, struct network)
void (*update)(struct layer, update_args)
void (*forward_gpu)(struct layer, struct network)
void (*backward_gpu)(struct layer, struct network)
void (*update_gpu)(struct layer, update_args)
int batch_normalize
int shortcut
int batch
int forced
int flipped
int inputs
int outputs
int nweights
int nbiases
int extra
int truths
int h
int w
int c
int out_h
int out_w
int out_c
int n
int max_boxes
int groups
int size
int side
int stride
int reverse
int flatten
int spatial
int pad
int sqrt
int flip
int index
int binary
int xnor
int steps
int hidden
int truth
float smooth
float dot
float angle
float jitter
float saturation
float exposure
float shift
float ratio
float learning_rate_scale
int softmax
int classes
int coords
int background
int rescore
int objectness
int does_cost
int joint
int noadjust
int reorg
int log
int tanh
float alpha
float beta
float kappa
float coord_scale
float object_scale
float noobject_scale
float mask_scale
float class_scale
int bias_match
int random
float thresh
int classfix
int absolute
int onlyforward
int stopbackward
int dontload
int dontloadscales
float temperature
float probability
float scale
char *cweights
int *indexes
int *input_layers
int *input_sizes
int *map
float *rand
float *cost
float *state
float *prev_state
float *forgot_state
float *forgot_delta
float *state_delta
float *combine_cpu
float *combine_delta_cpu
float *concat
float *concat_delta
float *binary_weights
float *biases
float *bias_updates
float *scales
float *scale_updates
float *weights
float *weight_updates
float *delta
float *output
float *squared
float *norms
float *spatial_mean
float *mean
float *variance
float *mean_delta
float *variance_delta
float *rolling_mean
float *rolling_variance
float *x
float *x_norm
float *m
float *v
float *bias_m
float *bias_v
float *scale_m
float *scale_v
float *z_cpu
float *r_cpu
float *h_cpu
float *prev_state_cpu
float *temp_cpu
float *temp2_cpu
float *temp3_cpu
float *dh_cpu
float *hh_cpu
float *prev_cell_cpu
float *cell_cpu
float *f_cpu
float *i_cpu
float *g_cpu
float *o_cpu
float *c_cpu
float *dc_cpu
float *binary_input
struct layer *input_layer
struct layer *self_layer
struct layer *output_layer
struct layer *reset_layer
struct layer *update_layer
struct layer *state_layer
struct layer *input_gate_layer
struct layer *state_gate_layer
struct layer *input_save_layer
struct layer *state_save_layer
struct layer *input_state_layer
struct layer *state_state_layer
struct layer *input_z_layer
struct layer *state_z_layer
struct layer *input_r_layer
struct layer *state_r_layer
struct layer *input_h_layer
struct layer *state_h_layer
struct layer *wz
struct layer *uz
struct layer *wr
struct layer *ur
struct layer *wh
struct layer *uh
struct layer *uo
struct layer *wo
struct layer *uf
struct layer *wf
struct layer *ui
struct layer *wi
struct layer *ug
struct layer *wg
tree *softmax_tree
size_t workspace_size
struct list

Public Members

int size
node *front
node *back
struct load_args

Public Members

int threads
char **paths
char *path
int n
int m
char **labels
int h
int w
int out_w
int out_h
int nh
int nw
int num_boxes
int min
int max
int size
int classes
int background
int scale
int center
int coords
float jitter
float angle
float aspect
float saturation
float exposure
float hue
data *d
image *im
image *resized
data_type type
tree *hierarchy
class Mat
#include <Core.hpp>

The Mat class can handle multiple matrix formats, from 1 to 4 channels, with different value types (float or uchar), and can be stored CPU- and/or GPU-side.

sl::Mat is defined in a row-major order: in the image buffer, the entire first row is stored first, followed by the entire second row, and so on.

The CPU and GPU buffers are not automatically synchronized, for performance reasons; you can use Mat::updateCPUfromGPU / Mat::updateGPUfromCPU to do it. If you are using the GPU side of the Mat object, make sure to call sl::Mat::free() before destroying the sl::Camera object: the destruction of the sl::Camera object deletes the CUDA context needed to free the GPU Mat memory.
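Given the row-major layout described above, the byte offset of pixel (x, y) is y·step + x·(channel count)·(element size), where step is the byte size of one pixel row. A self-contained sketch of that arithmetic (the helper name is ours, not part of the sl:: API):

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of pixel (x, y) in a row-major buffer where `step` is the
// byte size of one pixel row, with `channels` values of `elem` bytes each.
size_t pixel_offset(size_t x, size_t y, size_t step, size_t channels, size_t elem) {
    return y * step + x * channels * elem;
}
```

For a 1280x720 MAT_TYPE_8U_C4 matrix with a tightly packed step of 1280 * 4 = 5120 bytes, pixel (10, 2) starts at byte 2 * 5120 + 10 * 4 = 10280.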

Public Functions

Mat()

Empty Mat default constructor.

Mat(size_t width, size_t height, MAT_TYPE mat_type, MEM memory_type = MEM_CPU)

Mat constructor.

Parameters
  • width: : width of the matrix in pixels.
  • height: : height of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • memory_type: : defines where the buffer will be stored. (sl::MEM_CPU and/or sl::MEM_GPU). This function directly allocates the requested memory. It calls Mat::alloc.

Mat(size_t width, size_t height, MAT_TYPE mat_type, sl::uchar1 *ptr, size_t step, MEM memory_type = MEM_CPU)

Mat constructor from an existing data pointer.

Parameters
  • width: : width of the matrix in pixels.
  • height: : height of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • ptr: : pointer to the data array. (CPU or GPU).
  • step: : step of the data array. (the Bytes size of one pixel row)
  • memory_type: : defines where the buffer will be stored. (sl::MEM_CPU and/or sl::MEM_GPU). This function doesn’t allocate the memory.

Mat(size_t width, size_t height, MAT_TYPE mat_type, sl::uchar1 *ptr_cpu, size_t step_cpu, sl::uchar1 *ptr_gpu, size_t step_gpu)

Mat constructor from two existing data pointers, CPU and GPU.

Parameters
  • width: : width of the matrix in pixels.
  • height: : height of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • ptr_cpu: : CPU pointer to the data array.
  • step_cpu: : step of the CPU data array. (the Bytes size of one pixel row)
  • ptr_gpu: : GPU pointer to the data array.
  • step_gpu: : step of the GPU data array. (the Bytes size of one pixel row) This function doesn’t allocate the memory.

Mat(sl::Resolution resolution, MAT_TYPE mat_type, MEM memory_type = MEM_CPU)

Mat constructor.

Parameters
  • resolution: : the size of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • memory_type: : defines where the buffer will be stored. (sl::MEM_CPU and/or sl::MEM_GPU). This function directly allocates the requested memory. It calls Mat::alloc.

Mat(sl::Resolution resolution, MAT_TYPE mat_type, sl::uchar1 *ptr, size_t step, MEM memory_type = MEM_CPU)

Mat constructor from an existing data pointer.

Parameters
  • resolution: : the size of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • ptr: : pointer to the data array. (CPU or GPU).
  • step: : step of the data array (the size in bytes of one pixel row).
  • memory_type: : defines where the buffer will be stored. (sl::MEM_CPU and/or sl::MEM_GPU). This function doesn’t allocate the memory.

Mat(sl::Resolution resolution, MAT_TYPE mat_type, sl::uchar1 *ptr_cpu, size_t step_cpu, sl::uchar1 *ptr_gpu, size_t step_gpu)

Mat constructor from two existing data pointers, CPU and GPU.

Parameters
  • resolution: : the size of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • ptr_cpu: : CPU pointer to the data array.
  • step_cpu: : step of the CPU data array (the size in bytes of one pixel row).
  • ptr_gpu: : GPU pointer to the data array.
  • step_gpu: : step of the GPU data array (the size in bytes of one pixel row). This function doesn’t allocate the memory.

Mat(const sl::Mat &mat)

Mat constructor by copy (deep copy).

Parameters
  • mat: : the reference to the sl::Mat to copy. This function allocates and duplicates the data.

void alloc(size_t width, size_t height, MAT_TYPE mat_type, MEM memory_type = MEM_CPU)

Allocates the Mat memory.

Warning
It erases previously allocated memory.
Parameters
  • width: : width of the matrix in pixels
  • height: : height of the matrix in pixels
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • memory_type: : defines where the buffer will be stored. (sl::MEM_CPU and/or sl::MEM_GPU).

void alloc(sl::Resolution resolution, MAT_TYPE mat_type, MEM memory_type = MEM_CPU)

Allocates the Mat memory.

Warning
It erases previously allocated memory.
Parameters
  • resolution: : the size of the matrix in pixels.
  • mat_type: : the type of the matrix (sl::MAT_TYPE_32F_C1,sl::MAT_TYPE_8U_C4...)
  • memory_type: : defines where the buffer will be stored. (sl::MEM_CPU and/or sl::MEM_GPU).

~Mat()

Mat destructor. This function calls Mat::free to release owned memory.

void free(MEM memory_type = MEM_CPU|MEM_GPU)

Frees the owned memory.

Parameters
  • memory_type: : specify whether you want to free the sl::MEM_CPU and/or sl::MEM_GPU memory.

Mat &operator=(const Mat &that)

Performs a shallow copy. This function doesn’t copy the data array; it only copies the pointer.

Return
The new sl::Mat object, which points to the same data as that.
Parameters
  • that: : the sl::Mat to be copied.

ERROR_CODE updateCPUfromGPU()

Downloads data from DEVICE (GPU) to HOST (CPU), if possible.

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Note
If no CPU or GPU memory is available for this Mat, it is allocated directly.
Note
If verbose is set, information is displayed in case of failure.

ERROR_CODE updateGPUfromCPU()

Uploads data from HOST (CPU) to DEVICE (GPU), if possible.

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Note
If no CPU or GPU memory is available for this Mat, it is allocated directly.
Note
If verbose is set, information is displayed in case of failure.

ERROR_CODE copyTo(Mat &dst, COPY_TYPE cpyType = COPY_TYPE_CPU_CPU) const

Copies data to another Mat (deep copy).

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Note
If the destination is not allocated or does not have a compatible sl::MAT_TYPE or sl::Resolution, the current memory is freed and new memory is allocated directly.
Parameters
  • dst: : the Mat where the data will be copied.
  • cpyType: : specify the memories that will be used for the copy.

ERROR_CODE setFrom(const Mat &src, COPY_TYPE cpyType = COPY_TYPE_CPU_CPU)

Copies data from another Mat (deep copy).

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Note
If the current Mat is not allocated or does not have a compatible sl::MAT_TYPE or sl::Resolution with the source, the current memory is freed and new memory is allocated directly.
Parameters
  • src: : the Mat where the data will be copied from.
  • cpyType: : specify the memories that will be used for the update.

ERROR_CODE read(const char *filePath)

Reads an image from a file (only if sl::MEM_CPU is available on the current sl::Mat).

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Note
Supported sl::MAT_TYPE are: sl::MAT_TYPE_8U_C1, sl::MAT_TYPE_8U_C3 and sl::MAT_TYPE_8U_C4. Supported input file formats are PNG and JPEG. If verbose is set, information is displayed in case of failure.
Parameters
  • filePath: : file path including the name and extension.

ERROR_CODE write(const char *filePath)

Writes the sl::Mat (only if sl::MEM_CPU is available) into a file as an image.

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Note
Supported sl::MAT_TYPE are: sl::MAT_TYPE_8U_C1, sl::MAT_TYPE_8U_C3 and sl::MAT_TYPE_8U_C4. Supported output file formats are PNG and JPEG. If verbose is set, information is displayed in case of failure.
Parameters
  • filePath: : file path including the name and extension.

template <typename T>
ERROR_CODE setTo(T value, MEM memory_type = MEM_CPU)

Fills the Mat with the given value.

Note
This function is templated for sl::uchar1, sl::uchar2, sl::uchar3, sl::uchar4, sl::float1, sl::float2, sl::float3, sl::float4.
Parameters
  • value: : the value to be copied all over the matrix.
  • memory_type: : defines which buffer to fill, CPU and/or GPU. This function overwrites the entire matrix.

template <typename N>
ERROR_CODE setValue(size_t x, size_t y, N value, MEM memory_type = MEM_CPU)

Sets a value to a specific point in the matrix.

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Warning
Not efficient for sl::MEM_GPU; use it only on sparse data.
Note
This function is templated for sl::uchar1, sl::uchar2, sl::uchar3, sl::uchar4, sl::float1, sl::float2, sl::float3, sl::float4.
Parameters
  • x: : specify the column.
  • y: : specify the row.
  • value: : the value to be set.
  • memory_type: : defines which memory will be updated.

template <typename N>
ERROR_CODE getValue(size_t x, size_t y, N *value, MEM memory_type = MEM_CPU)

Returns the value of a specific point in the matrix.

Return
sl::SUCCESS if everything went well, sl::ERROR_CODE_FAILURE otherwise.
Warning
Not efficient for sl::MEM_GPU; use it only on sparse data.
Note
This function is templated for sl::uchar1, sl::uchar2, sl::uchar3, sl::uchar4, sl::float1, sl::float2, sl::float3, sl::float4.
Parameters
  • x: : specify the column
  • y: : specify the row
  • memory_type: : defines which memory should be read.

size_t getWidth() const

Returns the width of the matrix.

Return
The width of the matrix in pixels.

size_t getHeight() const

Returns the height of the matrix.

Return
The height of the matrix in pixels.

Resolution getResolution() const

Returns the resolution of the matrix.

Return
The resolution (width and height) of the matrix in pixels.

size_t getChannels() const

Returns the number of values stored in one pixel.

Return
The number of values in a pixel.

MAT_TYPE getDataType() const

Returns the format of the matrix.

Return
The format of the current Mat.

MEM getMemoryType() const

Returns the type of memory (CPU and/or GPU).

Return
The type of allocated memory.

template <typename N>
N *getPtr(MEM memory_type = MEM_CPU)

Returns the CPU or GPU data pointer.

Return
The pointer of the Mat data.
Parameters
  • memory_type: : specify whether you want sl::MEM_CPU or sl::MEM_GPU step.

size_t getStepBytes(MEM memory_type = MEM_CPU)

Returns the memory step in bytes (the size in bytes of one pixel row).

Return
The step in bytes of the specified memory.
Parameters
  • memory_type: : specify whether you want sl::MEM_CPU or sl::MEM_GPU step.

template <typename N>
size_t getStep(MEM memory_type = MEM_CPU)

Returns the memory step in number of elements (the number of values in one pixel row).

Return
The step in number of elements.
Parameters
  • memory_type: : specify whether you want sl::MEM_CPU or sl::MEM_GPU step.

size_t getStep(MEM memory_type = MEM_CPU)

Returns the memory step in number of elements (the number of values in one pixel row).

Return
The step in number of elements.
Parameters
  • memory_type: : specify whether you want sl::MEM_CPU or sl::MEM_GPU step.

size_t getPixelBytes()

Returns the size in bytes of one pixel.

Return
The size in bytes of a pixel.

size_t getWidthBytes()

Returns the size in bytes of a row.

Return
The size in bytes of a row.

sl::String getInfos()

Returns information about the Mat in a sl::String.

Return
A string containing the Mat information.

bool isInit()

Returns whether the Mat is initialized.

Return
True if the current Mat has been allocated (by the constructor or afterwards).

bool isMemoryOwner()

Returns whether the Mat owns the memory it accesses. If not, the memory won’t be freed when the Mat is destroyed.

Return
True if the Mat owns its memory, false otherwise.

void clone(const Mat &src)

Duplicates Mat by copy (deep copy).

Parameters
  • src: : the reference to the Mat to copy. This function copies the data array(s) and marks the new Mat as the memory owner.

Public Members

sl::String name
bool verbose = false

Private Members

Resolution size
size_t channels = 0
size_t step_gpu = 0
size_t step_cpu = 0
size_t pixel_bytes = 0
MAT_TYPE data_type
MEM mem_type = sl::MEM_CPU
uchar1 *ptr_cpu = NULL
uchar1 *ptr_gpu = NULL
uchar1 *ptr_internal = NULL
bool init = false
bool memory_owner = false
struct matrix

Public Members

int rows
int cols
float **vals
class Matrix3f
#include <types.hpp>

The class Matrix3f represents a generic 3x3 matrix.

sl::Matrix3f is defined in a row-major order: in the value buffer, the entire first row is stored first, followed by the entire second row, and so on. The data can be accessed through the ‘r’ pointer or by element name:

  r00, r01, r02       | 0 1 2 |
  r10, r11, r12  <->  | 3 4 5 |
  r20, r21, r22       | 6 7 8 |

Subclassed by sl::Rotation

Public Functions

Matrix3f()

Matrix3f default constructor.

Matrix3f(float data[])

Matrix3f constructor from an array of values (deep copy).

Matrix3f(const Matrix3f &mat)

Matrix3f copy constructor (deep copy).

Parameters

Matrix3f operator*(const Matrix3f &mat) const

Gives the result of the multiplication between two Matrix3f.

Matrix3f operator*(const double &scalar) const

Gives the result of the multiplication between a Matrix3f and a scalar.

bool operator==(const Matrix3f &mat) const

Tests two Matrix3f for equality.

bool operator!=(const Matrix3f &mat) const

Tests two Matrix3f for inequality.

float &operator()(int u, int v)

Gets access to a specific point in the Matrix3f (read / write).

Return
The value at the u, v coordinates.
Parameters
  • u: : specify the row
  • v: : specify the column

void inverse()

Sets the Matrix3f to its inverse.

void transpose()

Sets the Matrix3f to its transpose.

void setIdentity()

Sets the Matrix3f to identity.

void setZeros()

Sets the Matrix3f to zero.

sl::String getInfos()

Return the components of the Matrix3f in a sl::String.

Return
A sl::String containing the components of the current Matrix3f.

Public Members

float r00
float r01
float r02
float r10
float r11
float r12
float r20
float r21
float r22
float r[nbElem]
union sl::Matrix3f::@1 sl::Matrix3f::@2
sl::String matrix_name

Name of the matrix (optional).

Public Static Functions

static Matrix3f inverse(const Matrix3f &rotation)

Returns the inverse of a Matrix3f.

Return
The inverse of the given Matrix3f
Parameters
  • rotation: : the Matrix3f to compute the inverse from.

static Matrix3f transpose(const Matrix3f &rotation)

Returns the transpose of a Matrix3f.

Return
The transpose of the given Matrix3f
Parameters
  • rotation: : the Matrix3f to compute the transpose from.

static Matrix3f identity()

Creates an identity Matrix3f.

Return
A Matrix3f set to identity.

static Matrix3f zeros()

Creates a Matrix3f filled with zeros.

Return
A Matrix3f set to zero.

Public Static Attributes

const int nbElem = 9
class Matrix4f
#include <types.hpp>

The class Matrix4f represents a generic 4x4 matrix.

sl::Matrix4f is defined in a row-major order: in the value buffer, the entire first row is stored first, followed by the entire second row, and so on. The data can be accessed through the ‘m’ pointer or by element name:

  r00, r01, r02, tx       |  0  1  2  3 |
  r10, r11, r12, ty  <->  |  4  5  6  7 |
  r20, r21, r22, tz       |  8  9 10 11 |
  m30, m31, m32, m33      | 12 13 14 15 |

Subclassed by sl::Transform

Public Functions

Matrix4f()

Matrix4f default constructor.

Matrix4f(float data[])

Matrix4f constructor from an array of values (deep copy).

Matrix4f(const Matrix4f &mat)

Matrix4f copy constructor (deep copy).

Parameters

Matrix4f operator*(const Matrix4f &mat) const

Gives the result of the multiplication between two Matrix4f.

Matrix4f operator*(const double &scalar) const

Gives the result of the multiplication between a Matrix4f and a scalar.

bool operator==(const Matrix4f &mat) const

Tests two Matrix4f for equality.

bool operator!=(const Matrix4f &mat) const

Tests two Matrix4f for inequality.

float &operator()(int u, int v)

Gets access to a specific point in the Matrix4f (read / write).

Return
The value at the u, v coordinates.
Parameters
  • u: : specify the row.
  • v: : specify the column.

ERROR_CODE inverse()

Sets the Matrix4f to its inverse.

Return
SUCCESS if the inverse has been computed, ERROR_CODE_FAILURE if not (det = 0).

void transpose()

Sets the Matrix4f to its transpose.

void setIdentity()

Sets the Matrix4f to identity.

void setZeros()

Sets the Matrix4f to zero.

ERROR_CODE setSubMatrix3f(sl::Matrix3f input, int row = 0, int column = 0)

Sets a 3x3 Matrix inside the Matrix4f.

Note
Can be used to set the rotation matrix when the matrix4f is a pose or an isometric matrix.
Return
SUCCESS if everything went well, ERROR_CODE_FAILURE otherwise.
Parameters
  • sl::Matrix3f: : sub matrix to put inside the Matrix4f.
  • row: : index of the row to start the 3x3 block. Must be 0 or 1.
  • column: : index of the column to start the 3x3 block. Must be 0 or 1.

ERROR_CODE setSubVector3f(sl::Vector3<float> input, int column = 3)

Sets a 3x1 Vector inside the Matrix4f at the specified column index.

Note
Can be used to set the Translation/Position matrix when the matrix4f is a pose or an isometry.
Return
SUCCESS if everything went well, ERROR_CODE_FAILURE otherwise.
Parameters
  • sl::Vector3: : sub vector to put inside the Matrix4f.
  • column: : index of the column in which to set the vector. By default, it is the last column (translation for a Pose).

ERROR_CODE setSubVector4f(sl::Vector4<float> input, int column = 3)

Sets a 4x1 Vector inside the Matrix4f at the specified column index.

Note
Can be used to set the Translation/Position matrix when the matrix4f is a pose or an isometry.
Return
SUCCESS if everything went well, ERROR_CODE_FAILURE otherwise.
Parameters
  • sl::Vector4: : sub vector to put inside the Matrix4f.
  • column: : index of the column in which to set the vector. By default, it is the last column (translation for a Pose).

sl::String getInfos()

Return the components of the Matrix4f in a sl::String.

Return
A sl::String containing the components of the current Matrix4f.

Public Members

float r00
float r01
float r02
float tx
float r10
float r11
float r12
float ty
float r20
float r21
float r22
float tz
float m30
float m31
float m32
float m33
float m[nbElem]
union sl::Matrix4f::@5 sl::Matrix4f::@6
sl::String matrix_name

Name of the matrix (optional).

Public Static Functions

static Matrix4f inverse(const Matrix4f &mat)

Creates the inverse of a Matrix4f.

Return
The inverse of the given Matrix4f.
Parameters
  • mat: : the Matrix4f to compute the inverse from.

static Matrix4f transpose(const Matrix4f &mat)

Creates the transpose of a Matrix4f.

Return
The transpose of the given Matrix4f.
Parameters
  • mat: : the Matrix4f to compute the transpose from.

static Matrix4f identity()

Creates an identity Matrix4f.

Return
A Matrix4f set to identity.

static Matrix4f zeros()

Creates a Matrix4f filled with zeros.

Return
A Matrix4f set to zero.

Public Static Attributes

const int nbElem = 16
class Mesh
#include <Mesh.hpp>

A mesh contains the geometric data of the scene computed by the spatial mapping.

Public Functions

Mesh()

Default constructor which creates an empty Mesh.

~Mesh()

Mesh destructor.

bool filter(MeshFilterParameters params = MeshFilterParameters ())

Filters the mesh according to the given parameters.

Return
True if the filtering was successful, false otherwise.
Note
The filtering is a costly operation, but the resulting mesh is significantly lighter and less noisy; the parameters can be tweaked to get a mesh that better fits the final application. For instance, a collision mesh needs a coarser, more decimated resolution.
Parameters

bool applyTexture(MESH_TEXTURE_FORMAT texture_format = MESH_TEXTURE_RGB)

Applies texture to the mesh.

Return
True if the texturing was successful, false otherwise.
Warning
SpatialMappingParams::saveTextureData must be set as true when enabling the spatial mapping to be able to apply the textures.
Warning
The mesh should be filtered before calling this function, since Mesh::filter will erase the textures; texturing is also significantly slower on non-filtered meshes.

bool save(std::string filename, MESH_FILE_FORMAT type = MESH_FILE_OBJ)

Saves the mesh into a file.

Return
True if the file was successfully saved, false otherwise.
Note
Only sl::MESH_FILE_OBJ supports texture recording.
Parameters
  • filename: : the path and filename of the mesh
  • type: : defines the file type (extension)

bool load(const std::string filename)

Loads the mesh from a file.

Parameters
  • filename: : the path and filename of the mesh.
Return Value
  • true: if the loading was successful, false otherwise.

void clear()

Clears all the mesh data (empty the vectors).

Public Members

std::vector<sl::float3> vertices

Vertices are defined by a 3D point {x,y,z}.

std::vector<sl::uint3> triangles

Each triangle (or face) contains the indices of its three vertices.

It corresponds to the 3 vertices of the triangle, {v1, v2, v3}.

std::vector<sl::float3> normals

Normals are defined by three components, {nx, ny, nz}.

A normal is defined for each vertex (Mesh::vertices and Mesh::normals have the same size).

std::vector<sl::float2> uv

Texture coordinates define 2D points on a texture.

std::vector<std::vector<Indice>> material_indices

Stores the list of vertex indices associated with each texture image.

The first vector has the same size as Mesh::textures; the size of the second vector represents the number of faces associated with this texture. By iterating over the second vector you can access the vertices/normals and texture coordinates of the current texture.

Note
Contains data only after you call Mesh::applyTexture().

std::vector<Texture> textures

List of textures images.

Private Members

sl::TextureImagePool im_pool
sl::CameraParameters cam_param
float min_d
float max_d
bool welded
size_t memory = 0
class MeshFilterParameters
#include <Mesh.hpp>

Parameters for the optional filtering step of a sl::Mesh.

A default constructor is enabled and set to its default parameters.

Note
Parameters can be user adjusted.

Public Types

enum FILTER

Values:

FILTER_LOW

Soft decimation and smoothing.

FILTER_MEDIUM

Decimates the number of faces and applies a soft smoothing.

FILTER_HIGH

Drastically reduces the number of faces.

Public Functions

MeshFilterParameters(FILTER filtering_ = FILTER_LOW)

Default constructor, set all parameters to their default and optimized values.

void set(FILTER filtering_ = FILTER_LOW)

Sets the filtering intensity.

Parameters

bool save(sl::String filename)

Saves the current bunch of parameters into a file.

Return
true if the file was successfully saved, otherwise false.
Parameters
  • filename: : the path to the file in which the parameters will be stored.

bool load(sl::String filename)

Loads the values of the parameters contained in a file.

Return
true if the file was successfully loaded, otherwise false.
Parameters
  • filename: : the path to the file from which the parameters will be loaded.

Public Members

FILTER filtering = FILTER::FILTER_LOW
class METADATA

Inherits from Structure

Private Static Attributes

list ObjectRecognition.METADATA._fields_ = [("classes", c_int), ("names", POINTER(c_char_p))]
struct metadata

Public Members

int classes
char **names
class METADATA

Inherits from Structure

Private Static Attributes

list FaceDetect.METADATA._fields_ = [("classes", c_int), ("names", POINTER(c_char_p))]
class METADATA

Inherits from Structure

Private Static Attributes

list darknet.METADATA._fields_ = [("classes", c_int), ("names", POINTER(c_char_p))]
struct moves

Public Members

char **data
int n
struct network

Public Members

int n
int batch
size_t *seen
int *t
float epoch
int subdivisions
layer *layers
float *output
learning_rate_policy policy
float learning_rate
float momentum
float decay
float gamma
float scale
float power
int time_steps
int step
int max_batches
float *scales
int *steps
int num_steps
int burn_in
int adam
float B1
float B2
float eps
int inputs
int outputs
int truths
int notruth
int h
int w
int c
int max_crop
int min_crop
int center
float angle
float aspect
float exposure
float saturation
float hue
int gpu_index
tree *hierarchy
float *input
float *truth
float *delta
float *workspace
int train
int index
float *cost
struct node

Public Members

void *val
struct node *next
struct node *prev
class Orientation
#include <Core.hpp>

The class Orientation is designed to contain orientation data from the tracking. sl::Orientation is a vector defined as [ox, oy, oz, ow].

Inherits from Vector4< float >

Public Functions

Orientation()

empty Orientation default constructor.

Orientation(const Orientation &orientation)

Orientation copy constructor (deep copy).

Parameters

Orientation(const Vector4<float> &in)

Orientation copy constructor (deep copy).

Parameters
  • in: : the vector to copy.

Orientation(const Rotation &rotation)

Orientation constructor from a Rotation. It converts the Rotation representation to the Orientation one.

Parameters

Orientation(const Translation &tr1, const Translation &tr2)

Orientation constructor from a vector represented by two Translation.

Parameters
  • tr1: : the first point of the vector.
  • tr2: : the second point of the vector.

float operator()(int x)

Returns the value at a specific position in the Orientation.

Return
The value at the x position.
Parameters
  • x: : the position of the value

Orientation operator*(const Orientation &orientation) const

Multiplication operator by an Orientation.

Return
The current orientation after being multiplied by the other orientation.
Parameters
  • orientation: : the orientation.

void setRotation(const Rotation &rotation)

Sets the orientation from a Rotation.

Parameters

Rotation getRotation() const

Returns the current orientation as a Rotation.

Return
The rotation computed from the orientation data.

void setIdentity()

Sets the current Orientation to identity.

void setZeros()

Fills the current Orientation with zeros.

void normalise()

Normalizes the current Orientation.

Public Static Functions

static Orientation identity()

Creates an Orientation initialized to identity.

Return
An identity Orientation.

static Orientation zeros()

Creates an Orientation filled with zeros.

Return
An Orientation filled with zeros.

static Orientation normalise(const Orientation &orient)

Creates the normalized version of an existing Orientation.

Return
The normalized version of the Orientation.
Parameters

class Pose
#include <Camera.hpp>

The class Pose contains the Motion tracking data which gives the position and orientation of the ZED in space (orientation/quaternion, rotation matrix, translation/position) and other connected values (timestamp, confidence).

Public Functions

Pose()

Default constructor which creates an empty Pose.

Pose(const Pose &pose)

Pose constructor with deep copy.

Pose(const sl::Transform &pose_data, unsigned long long mtimestamp = 0, int mconfidence = 0)

Pose constructor with deep copy.

~Pose()

Pose destructor.

sl::Translation getTranslation()

Returns the camera translation from the pose.

Return
The translation vector of the ZED position.

sl::Orientation getOrientation()

Returns the camera orientation from the pose.

Return
The orientation vector of the ZED position.

sl::Rotation getRotation()

Returns the camera rotation (3x3) from the pose.

Return
The rotation matrix of the ZED position.

sl::Vector3<float> getRotationVector()

Returns the camera rotation (3x1) of the pose.

Return
The rotation vector of the ZED position.

Public Members

bool valid

Boolean that indicates whether tracking is activated or not. You should check this first if something seems wrong.

unsigned long long timestamp

Timestamp of the pose. This timestamp should be compared with the camera timestamp for synchronization.

sl::Transform pose_data

4x4 Matrix which contains the rotation (3x3) and the translation. Orientation is extracted from this transform as well.

int pose_confidence

Confidence/quality of the pose estimation for the target frame. A confidence metric of the tracking in [0, 100]: 0 means the tracking is lost, 100 means the tracking can be fully trusted.

struct RecordingState
#include <defines.hpp>

Recording structure that contains information about SVO.

Public Members

bool status

Status of the current frame: true on success, false if the frame could not be written to the SVO file.

double current_compression_time

compression time for the current frame in ms.

double current_compression_ratio

compression ratio (% of raw size) for the current frame.

double average_compression_time

average compression time in ms since beginning of recording.

double average_compression_ratio

compression ratio (% of raw size) since beginning of recording.

struct Resolution
#include <Core.hpp>

Width and height of an array.

Public Functions

Resolution(size_t w_ = 0, size_t h_ = 0)
size_t area()

Returns the area of the image.

Return
The number of pixels of the array.

bool operator==(const Resolution &that) const

Tests if the given sl::Resolution has the same properties.

Return
True if the sizes matches.

bool operator!=(const Resolution &that) const

Tests if the given sl::Resolution has different properties.

Return
True if the sizes are not equal.

Public Members

size_t width

array width in pixels

size_t height

array height in pixels

class Rotation
#include <Core.hpp>

The class Rotation is designed to contain rotation data from the tracking.

It inherits from the generic sl::Matrix3f

Inherits from sl::Matrix3f

Public Functions

Rotation()

empty Rotation default constructor.

Rotation(const Rotation &rotation)

Rotation copy constructor (deep copy).

Parameters

Rotation(const Matrix3f &mat)

Rotation copy constructor (deep copy).

Parameters
  • mat: : the mat to copy.

Rotation(const Orientation &orientation)

Rotation constructor from an Orientation. It converts the Orientation representation to the Rotation one.

Parameters

Rotation(const float angle, const Translation &axis)

Creates a Rotation representing the 3D rotation of angle around an arbitrary 3D axis.

Parameters
  • angle: : the rotation angle in rad.
  • axis: : the 3D axis to rotate around.

void setOrientation(const Orientation &orientation)

Sets the Rotation from an Orientation.

Parameters
  • orientation: : the Orientation containing the rotation to set.

Orientation getOrientation()

Returns the Orientation corresponding to the current Rotation.

Return
The orientation computed from the current rotation.

sl::Vector3<float> getRotationVector()

Returns the rotation vector (Rx,Ry,Rz) corresponding to the current Rotation (using Rodrigues’ transformation).

Return
The rotation vector.

void setRotationVector(const sl::Vector3<float> &vec_rot)

Sets the Rotation from a rotation vector (using Rodrigues’ transformation).

Parameters

class RuntimeParameters
#include <Camera.hpp>

Contains all the Camera::grab() parameters.

Public Functions

RuntimeParameters(SENSING_MODE sensing_mode_ = SENSING_MODE::SENSING_MODE_STANDARD, bool enable_depth_ = true, bool enable_point_cloud_ = true, bool move_point_cloud_to_world_frame_ = false)

Default constructor, set all parameters to their default and optimized values.

bool save(sl::String filename)

Saves the current bunch of parameters into a file.

Return
true if the file was successfully saved, otherwise false.
Parameters
  • filename: : the path to the file in which the parameters will be stored.

bool load(sl::String filename)

Loads the values of the parameters contained in a file.

Return
true if the file was successfully loaded, otherwise false.
Parameters
  • filename: : the path to the file from which the parameters will be loaded.

Public Members

SENSING_MODE sensing_mode

Defines the type of disparity map; for more information see the sl::SENSING_MODE definition.

bool enable_depth

Defines if the depth map should be computed.

If false, only the images are available.

bool enable_point_cloud

Defines if the point cloud should be computed (including XYZRGBA).

bool move_point_cloud_to_world_frame

Apply the current pose to the point cloud.

This means the point cloud values will be defined in the world frame (as opposed to the camera frame).

struct section

Public Members

char *type
list *options
struct size_params

Public Members

int batch
int inputs
int h
int w
int c
int index
int time_steps
network net
struct sortable_bbox

Public Members

int index
int class
float **probs
network net
char *filename
int classes
float elo
float *elos
class SpatialMappingParameters
#include <Camera.hpp>

Parameters for ZED scanning initialization.

A default constructor is enabled and set to its default parameters.

Note
Parameters can be user adjusted.

Public Types

enum RESOLUTION

List the spatial mapping resolution presets.

Values:

RESOLUTION_HIGH

Creates a detailed geometry; requires lots of memory.

RESOLUTION_MEDIUM

Small variations in the geometry will disappear; useful for big objects.

RESOLUTION_LOW

Keeps only large variations of the geometry; useful outdoors.

enum RANGE

List the spatial mapping depth range presets.

Values:

RANGE_NEAR

Only depth close to the camera will be used by the spatial mapping.

RANGE_MEDIUM

Medium depth range.

RANGE_FAR

Takes into account objects that are far away; useful outdoors.

typedef std::pair<float, float> interval

Public Functions

SpatialMappingParameters(RESOLUTION resolution = RESOLUTION_HIGH, RANGE range = RANGE_MEDIUM, int max_memory_usage_ = 2048, bool save_texture_ = true)

Default constructor, set all parameters to their default and optimized values.

void set(RESOLUTION resolution = RESOLUTION_HIGH)

Sets the resolution corresponding to the given sl::SpatialMappingParameters::RESOLUTION preset.

Parameters
  • resolution: : the desired sl::SpatialMappingParameters::RESOLUTION.

void set(RANGE range = RANGE_MEDIUM)

Sets the maximum value of depth corresponding to the given sl::SpatialMappingParameters::RANGE presets.

Parameters
  • range: : the desired sl::SpatialMappingParameters::RANGE.

bool save(sl::String filename)

Saves the current bunch of parameters into a file.

Return
true if the file was successfully saved, otherwise false.
Parameters
  • filename: : the path to the file in which the parameters will be stored.

bool load(sl::String filename)

Loads the values of the parameters contained in a file.

Return
true if the file was successfully loaded, otherwise false.
Parameters
  • filename: : the path to the file from which the parameters will be loaded.

Public Members

int max_memory_usage = 2048

The maximum CPU memory (in megabytes) allocated for the meshing process (the value will be adapted to fit your configuration in any case).

bool save_texture = true

Set to true if you want to be able to apply texture to your mesh after its creation.

Note
This option will take more memory.

const interval allowed_min = std::make_pair(0.3f, 10.f)

The minimal depth value allowed by the spatial mapping.

const interval allowed_max = std::make_pair(2.f, 20.f)

The maximal depth value allowed by the spatial mapping.

interval range_meter = std::make_pair(0.7f, 5.f)

Depth integration range in meters. range_meter.first should fit within interval::allowed_min and will be set to sl::Camera::getDepthMinRangeValue if you do not change it. range_meter.second should fit within interval::allowed_max.

const interval allowed_resolution = std::make_pair(0.01f, 0.2f)

The resolutions allowed by the spatial mapping.

float resolution_meter = 0.03f

Spatial mapping resolution in meters, should fit interval::allowed_resolution.

Public Static Functions

static float get(RESOLUTION resolution = RESOLUTION_HIGH)

Return the resolution corresponding to the given sl::SpatialMappingParameters::RESOLUTION preset.

Return
The resolution in meters.
Parameters
  • resolution: : the desired sl::SpatialMappingParameters::RESOLUTION.

static float get(RANGE range = RANGE_MEDIUM)

Return the maximum value of depth corresponding to the given sl::SpatialMappingParameters::RANGE presets.

Return
The maximum value of depth.
Parameters
  • range: : the desired sl::SpatialMappingParameters::RANGE.
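The preset getters above map symbolic presets to concrete values, and range_meter must stay within the documented allowed intervals. A hedged Python sketch of that bookkeeping follows; the preset values are invented placeholders (not the SDK's real numbers), while the allowed intervals come from the members documented above:

```python
# Illustrative sketch, not the SDK implementation. The preset values
# below are hypothetical placeholders; the allowed intervals match the
# allowed_min / allowed_max members documented above.

ALLOWED_MIN = (0.3, 10.0)   # allowed_min: valid interval for the near bound
ALLOWED_MAX = (2.0, 20.0)   # allowed_max: valid interval for the far bound

RESOLUTION_PRESETS = {"HIGH": 0.02, "MEDIUM": 0.05, "LOW": 0.10}  # hypothetical meters
RANGE_PRESETS = {"NEAR": 3.5, "MEDIUM": 5.0, "FAR": 10.0}         # hypothetical meters

def clamp(value, interval):
    """Clamp value into [interval[0], interval[1]]."""
    low, high = interval
    return max(low, min(high, value))

def make_range_meter(near, far):
    """Build a (near, far) depth-integration interval that fits the
    allowed_min and allowed_max intervals, mirroring range_meter's rules."""
    return (clamp(near, ALLOWED_MIN), clamp(far, ALLOWED_MAX))
```

For instance, a requested interval of (0.1, 50.0) would be pulled back to (0.3, 20.0) by the clamping step.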

struct stbi_io_callbacks

Public Members

int (*read)(void *user, char *data, int size)
void (*skip)(void *user, int n)
int (*eof)(void *user)
struct Texture
#include <Mesh.hpp>

Contains information about texture images associated to the Mesh.

Public Members

std::string name

The name of the file in which the texture is saved.

sl::Mat texture

A sl::Mat that contains the data of the texture.

unsigned int indice_gl

Useful as an OpenGL binding reference (value not set by the SDK).

class TrackingParameters
#include <Camera.hpp>

Parameters for ZED tracking initialization.

A default constructor is enabled and set to its default parameters.

Note
Parameters can be user adjusted.

Public Functions

TrackingParameters(sl::Transform init_pos = sl::Transform (), bool _enable_memory = true, sl::String _area_path = sl::String())

Default constructor; sets all parameters to their default and optimized values.

bool save(sl::String filename)

Saves the current set of parameters into a file.

Return
true if the file was successfully saved, otherwise false.
Parameters
  • filename: : the path to the file in which the parameters will be stored.

bool load(sl::String filename)

Loads the values of the parameters contained in a file.

Return
true if the file was successfully loaded, otherwise false.
Parameters
  • filename: : the path to the file from which the parameters will be loaded.

Public Members

sl::Transform initial_world_transform

Position of the camera in the world frame when camera is started.

By default it should be identity.

Note
The camera frame (which defines the reference frame for the camera) is by default positioned at the world frame when tracking is started. Use this sl::Transform to place the camera frame in the world frame. default : Identity matrix

bool enable_spatial_memory

This mode enables the camera to learn and remember its surroundings.

This helps correct motion tracking drift and position different cameras relative to each other in space.

Warning
: This mode requires few resources to run and greatly improves tracking accuracy. We recommend leaving it enabled by default. default : true

sl::String area_file_path

Area localization mode can record and load (if area_file_path is specified) a file that describes the surroundings.

Note
Loading an area file will start a searching phase during which the camera will try to position itself in the previously learned area.
Warning
: The area file describes a specific location. If you are using an area file describing a different location, the tracking function will continuously search for a position and may not find a correct one.
Warning
The ‘.area’ file can only be used with the same depth mode (sl::MODE) as the one used during area recording. default : NULL

class Transform
#include <Core.hpp>

The class Transform contains a 4x4 matrix that specifically contains a rotation 3x3 and a 3x1 translation.

It then contains the orientation as well. It can be used to create any type of Matrix4x4 or sl::Matrix4f that must be specifically used for handling a rotation and position information (OpenGL, Tracking...) It inherits from the generic sl::Matrix4f

Inherits from sl::Matrix4f

Public Functions

Transform()

Transform default constructor.

Transform(const Transform &motion)

Transform copy constructor (deep copy).

Transform(const Matrix4f &mat)

Transform constructor from a Matrix4f (deep copy).

Transform(const Rotation &rotation, const Translation &translation)

Transform constructor from a Rotation and a Translation.

Transform(const Orientation &orientation, const Translation &translation)

Transform constructor from an Orientation and a Translation.

void setRotation(const Rotation &rotation)

Sets the rotation of the current Transform from a Rotation.

Rotation getRotation() const

Returns the Rotation of the current Transform.

Return
The Rotation created from the Transform values.
Warning
The given Rotation contains a copy of the Transform values. Not references.

void setTranslation(const Translation &translation)

Sets the translation of the current Transform from a Translation.

Translation getTranslation() const

Returns the Translation of the current Transform.

Return
The Translation created from the Transform values.
Warning
The given Translation contains a copy of the Transform values. Not references.

void setOrientation(const Orientation &orientation)

Sets the orientation of the current Transform from an Orientation.


Orientation getOrientation() const

Returns the Orientation of the current Transform.

Return
The Orientation created from the Transform values.
Warning
The given Orientation contains a copy of the Transform values. Not references.

sl::Vector3<float> getRotationVector()

Returns the vector Rotation (3x1) of the Transform.

Return
The rotation value for each axis (rx,ry,rz).

void setRotationVector(const sl::Vector3<float> &vec_rot)

Sets the Rotation 3x3 of the Transform with a 3x1 rotation vector (using Rodrigues’ transformation).

Parameters
  • vec_rot: : vector that contains the rotation value for each axis (rx,ry,rz).
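setRotationVector and getRotationVector convert between a 3x1 rotation vector (axis times angle) and the 3x3 rotation block of the Transform, using Rodrigues' formula. A minimal plain-Python sketch of that conversion (not the SDK implementation) follows:

```python
import math

# Minimal sketch of Rodrigues' formula: rotation vector (rx, ry, rz)
# -> 3x3 rotation matrix. Plain Python, not the SDK implementation.

def rodrigues(vec_rot):
    """Convert a rotation vector (rx, ry, rz) to a 3x3 rotation matrix."""
    rx, ry, rz = vec_rot
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)  # rotation angle = vector norm
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta  # unit rotation axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    # R = cos(theta)*I + (1 - cos(theta))*k*k^T + sin(theta)*[k]_x
    return [
        [c + kx * kx * v,      kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v,      ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]
```

For example, the rotation vector (0, 0, pi/2) yields a 90-degree rotation about the z axis.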

class Translation
#include <Core.hpp>

The class Translation is designed to contain translation data from the tracking. sl::Translation is a vector as [tx, ty, tz]. You can access the data through the ‘t’ pointer or by element name: tx, ty, tz <-> | 0 1 2 |.

Inherits from Vector3< float >

Public Functions

Translation()

Empty Translation default constructor.

Translation(const Translation &translation)

Translation copy constructor (deep copy).

Translation(float t1, float t2, float t3)

Translation constructor.

Parameters
  • t1: : the x translation.
  • t2: : the y translation.
  • t3: : the z translation.

Translation(Vector3<float> in)

Translation constructor from a vector.

Parameters
  • in: : vector.

Translation operator*(const Orientation &mat) const

Multiplication operator by an Orientation.

Return
The current Translation after being multiplied by the orientation.

void normalize()

Normalizes the current translation.

float &operator()(int x)

Get the value at specific position in the Translation.

Return
The value at the x position.
Parameters
  • x: : the position of the value

Public Static Functions

static Translation normalize(const Translation &tr)

Get the normalized version of a given Translation.

Return
Another Translation object, equal to tr.normalize().

struct tree

Public Members

int *leaf
int n
int *parent
int *child
int *group
char **name
int groups
int *group_size
int *group_offset
struct update_args

Public Members

int batch
float learning_rate
float momentum
float decay
int adam
float B1
float B2
float eps
int t
struct yolo_obj

Public Members

char darknet_path[1024]
char **names
float nms
box *boxes
float **probs
network net
namespace ctypes
namespace darknet

Functions

load_meta(f f)
load_net(cfg cfg, weights weights)
load_img(f f)
letterbox_img(im im, w w, h h)
predict(net net, im im)
classify(net net, meta meta, im im)
detect(net net, meta meta, im im)

Variables

lib = CDLL("/home/pjreddie/documents/darknet/libdarknet.so", RTLD_GLOBAL)
argtypes
restype
net = load_net("cfg/densenet.cfg", "/home/pjreddie/trained/densenet201.weights")
im = load_img("data/wolf.jpg")
meta = load_meta("cfg/imagenet1k.data")
r = classify(net, meta, im)
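The darknet wrapper above relies on ctypes' argtypes/restype declarations to call into libdarknet.so. Since that library's path is machine-specific, the sketch below demonstrates the same binding pattern against the standard C math library instead; without the declarations, ctypes would assume int arguments and an int return value:

```python
import ctypes
import ctypes.util

# Sketch of the ctypes binding pattern used by the darknet wrapper,
# demonstrated on the standard C math library instead of libdarknet.so
# (whose path is machine-specific).

libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature `double cos(double)`; without this, ctypes
# would marshal the argument as int and truncate the return value.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double
```

The same pattern appears in the wrapper's own declarations, e.g. setting argtypes and restype on load_net, load_img, and predict.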
namespace DescribeSceneSrv

Functions

service_callback()
namespace example

Variables

string example.darknet_path = './darknet'
string example.datacfg = 'cfg/coco.data'
string example.cfgfile = 'cfg/tiny-yolo.cfg'
string example.weightfile = '../tiny-yolo.weights'
string example.filename = "/home/roboy/outputRoboy.mp4"
float example.thresh = 0.24
float example.hier_thresh = 0.5
cam = cv2.VideoCapture(filename)
ret_val
img = img.transpose(2,0,1)
fourcc = cv2.VideoWriter_fourcc(*'XVID')
outVideo = cv2.VideoWriter('outputRoboySkyfall.mp4',fourcc, 20.0, (800,533))
ok
frame = imutils.resize(img, width=800)
c
h
w
data = img.ravel()/255.0
outputs = pyyolo.detect(w, h, c, data, thresh, hier_thresh)
tuple example.p1 = (output['left'],output['top'])
tuple example.p2 = (output['right'],output['bottom'])
namespace face_detection

This module processes realsense camera input and runs face detection, alignment and pose estimation.

Module Tasks:

  • Main loop to process realsense camera input
  • Run Face Detection based MTCNN for Joint Face Detection & Alignment for each frame
  • Track Face over frames
  • Calculate Face Position in 3D coordinates
  • Send ROS msg containing face area, key points and face pose and unique ID for each detected face

Current Workarounds:

  • Tracking not implemented (no unique face id provided)
  • 3D coordinates not implemented (face region used as a distance measure)
  • Function for face recognition is also implemented in this module for simplicity (to be moved into another module)
  • No ROS communication

Functions

detect_face_and_landmarks_mtcnn(img img)

Function to do face detection and alignment on an image.

Run face detection on the full input image using a MTCNN for Joint Detection and Alignment from here: https://github.com/pangyupo/mxnet_mtcnn_face_detection

Return
Bounding boxes bb and landmark points for eyes, nose and mouth edges.
Parameters
  • img: The RGB image

align_face_mtcnn(img img, bb bb)

Function to align detected faces.

The current implementation crops the picture given a face region. We do not use actual alignment because performance increase for face recognition is marginal and only slows down realtime performance as is also argued here: https://github.com/davidsandberg/facenet/issues/93

Return
Returns the cropped face region.
Parameters
  • img: The RGB image
  • bb: The bounding box of a face as tuple (x1, y1, x2, y2)

draw_rects(img img, bbs bbs, resize_factor resize_factor = 1)

Function to draw bounding boxes in a picture.

Given an image, the bounding boxes for the corresponding face regions are drawn. Additionally a resize_factor is used if the bounding boxes were calculated on a scaled version of the input image. Default value of the resize factor is 1, meaning bounding boxes were calculated on the same image size.

Return
Image overlayed with the bounding boxes.
Parameters
  • img: The RGB image
  • bbs: An array of bounding boxes of a face as tuple (x1, y1, x2, y2)
  • resize_factor: Factor to scale up bounding box size if calculated on a different picture scale
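The resize_factor bookkeeping described above can be sketched as follows. This is a hedged illustration (scale_bbs is a hypothetical helper, not the module's actual code): boxes detected on a downscaled image are divided by the resize factor to map them back onto the full-size image.

```python
# Hypothetical helper illustrating the resize_factor logic: bounding
# boxes computed on an image scaled by resize_factor are mapped back
# to original-image coordinates by dividing each coordinate.

def scale_bbs(bbs, resize_factor):
    """Map (x1, y1, x2, y2) boxes from the resized image back to the original."""
    return [tuple(int(round(v / resize_factor)) for v in bb) for bb in bbs]
```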

draw_landmarks(img img, points points, resize_factor resize_factor)

Function to draw feature points in a picture.

Given an image, the feature points for the corresponding faces are drawn. Additionally a resize_factor is used if the feature points were calculated on a scaled version of the input image. Default value of the resize factor is 1, meaning the feature points were calculated on the same image size.

Return
Image overlayed with the feature points
Parameters
  • img: The RGB image
  • points: An array containing arrays of feature points of a face
  • resize_factor: Factor to scale up feature point coordinates if calculated on a different picture scale

get_closest_face(bbs bbs)

Returns the closest face of all detected faces.

Current implementation uses bounding box size to compare proximity

Return
The array index of the biggest bounding box.
Parameters
  • bbs: An array of bounding boxes of a face as tuple (x1, y1, x2, y2).
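The proximity heuristic described above (largest bounding box = closest face) can be sketched in a few lines. This is an illustration of the stated heuristic, not the module's actual code:

```python
# Sketch of the proximity heuristic: the "closest" face is the one
# whose bounding box covers the largest area.

def get_closest_face(bbs):
    """Return the index of the bounding box (x1, y1, x2, y2) with the largest area."""
    areas = [(x2 - x1) * (y2 - y1) for (x1, y1, x2, y2) in bbs]
    return max(range(len(bbs)), key=lambda i: areas[i])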

face_detected(bbs bbs)

Checks whether a face is visible within certain distance.

Current implementation uses bounding box to check for proximity. Key value defined in FACE_AREA.

Return
True if a face is close enough, False otherwise
Parameters
  • bbs: An array of bounding boxes of a face as tuple (x1, y1, x2, y2).

recognize_face(face_img face_img, session session, classifier classifier)

Identifies a face using Facenet.

TODO: to be moved into another module. The function calculates the 128D embeddings of a given face using FaceNet, following this implementation: https://github.com/davidsandberg/facenet. The embeddings are then run through an SVM classifier to identify the face.

Return
Returns the name of the face.
Parameters
  • face_img: The cropped image of the face region
  • session: The tensorflow session with FaceNet already loaded
  • classifier: The SVM classifier already loaded

load_model(model_dir model_dir, model_meta model_meta, model_content model_content)

Function to load a tensorflow model.

TODO: to be moved into another module.

Return
Returns a tensorflow session
Parameters
  • model_dir: model directory
  • model_meta: meta file
  • model_content: checkpoint file

get_model_filenames(model_dir model_dir)

Helper function to load a tensorflow model.

TODO: to be moved into another module. The function finds the meta_file and checkpoint within a given path.

Return
Returns meta_file and checkpoint
Parameters
  • model_dir: Path where the model is stored

Variables

int face_detection.EXPECT_SIZE = 160
int face_detection.FACE_AREA = 1500
int face_detection.x_pixel = 640

Entry Point to run face detection.

Loads all data and processes realsense camera input in a loop.

int face_detection.y_pixel = 480
float face_detection.resize_factor = 0.5
bool face_detection.face_nearby = False
int face_detection.no_face_detect_counter = 0
dev = pyrs.Device( device_id=0, streams=[ pyrs.ColourStream(width=x_pixel, height=y_pixel, fps=30), pyrs.DepthStream() ])
sess = tf.Session(config=tf.ConfigProto(log_device_placement=False))
pnet
rnet
onet
int face_detection.minsize = 20
list face_detection.threshold = [0.6, 0.7, 0.7]
float face_detection.factor = 0.709
string face_detection.tree_model = "models/Tree/own.mod"
string face_detection.svm_model = "models/SVM/svm_lfw.mod"
clf = pickle.load(open(tree_model, "rb"))
string face_detection.model_dir = 'models/facenet'
meta_file
ckpt_file
session = load_model(model_dir, meta_file, ckpt_file)
graph = tf.get_default_graph()
image_batch = graph.get_tensor_by_name("input:0")
phase_train_placeholder = graph.get_tensor_by_name("phase_train:0")
embeddings = graph.get_tensor_by_name("embeddings:0")
c = cv2.cvtColor(dev.colour, cv2.COLOR_RGB2BGR)
int face_detection.d = dev.depth*dev.depth_scale*1000
img = cv2.resize(c, (int(resize_factor * x_pixel), int( resize_factor * y_pixel)))
d_img = cv2.resize(d, (int(resize_factor * x_pixel), int( resize_factor * y_pixel)))
total_boxes
points
start_recognize_face = False
draw = draw_rects(c.copy(), total_boxes, resize_factor)
namespace FaceDetect

@package FaceDetect
1. Face is detected using the Dlib library
2. 68 landmarks are located on the face and circles are drawn over the landmarks.
3. The position (rect) of the face is sent to the main process through the RectQueue.
4. The frame queue is kept as a future reference to send data to the main process

Functions

StartDetection(CameraQueue CameraQueue, FrameQueue FrameQueue, RectQueue RectQueue, FacepointQueue FacepointQueue, SpeakerQueue SpeakerQueue)
sample(probs probs)
c_array(ctype ctype, values values)
detect(net net, meta meta, image image, thresh thresh = .5, hier_thresh hier_thresh = .5, nms nms = .45)
draw_results(res res, img img)
Initialize()
detectObjects(frame frame, detect_net detect_net, detect_meta detect_meta)

Variables

lib = CDLL("../darknet/libdarknet.so", RTLD_GLOBAL)
argtypes
restype
predict = lib.network_predict_p
make_boxes = lib.make_boxes
free_ptrs = lib.free_ptrs
num_boxes = lib.num_boxes
make_probs = lib.make_probs
detect = lib.network_predict_p
reset_rnn = lib.reset_rnn
load_net = lib.load_network_p
free_image = lib.free_image
letterbox_image = lib.letterbox_image
load_meta = lib.get_metadata
load_image = lib.load_image_color
predict_image = lib.network_predict_image
network_detect = lib.network_detect
int FaceDetect.DONE = 0
namespace FindObjectSrv

Functions

service_callback()

Variables

int FindObjectSrv.i = 1
request = await websocket.recv()
type = json.loads(request)["type"]
FoundObjects = ObjectRects.getQueue();
bool FindObjectSrv.found = False
dictionary FindObjectSrv.coordinates = {}
dictionary FindObjectSrv.answer = {'found': found, 'coordinates': coordinates}
level
namespace FreezeModel

Functions

freeze_graph(model_folder model_folder)

Variables

dir = os.path.dirname(os.path.realpath(__file__))
parser = argparse.ArgumentParser()
type
str
help
args = parser.parse_args()
namespace make_labels

Functions

make_labels(s s)

Variables

string make_labels.font = 'futura-normal'
namespace multiprocess

Functions

TestProcess(number number)
detectFaces(FrameQueue FrameQueue, RectQueue RectQueue)
tracking(RectQueue RectQueue, TrackQueue TrackQueue)

Variables

list multiprocess.procs = []
FrameQueue = Queue();
RectQueue = Queue();
TrackQueue = Queue();
detectFaceProc = Process(target=detectFaces,args=(FrameQueue,RectQueue,))
trackProc = Process(target=tracking,args=(RectQueue,TrackQueue,))
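The process-and-queue wiring above can be sketched as a runnable pipeline. The detect/track stages below are placeholder functions (not the module's real code) that show the pattern: each stage's output queue is the next stage's input queue.

```python
from multiprocessing import Process, Queue

# Runnable sketch of the pattern above: worker processes connected by
# queues, with one stage's output queue feeding the next stage.
# detect/track are placeholder stages, not the module's real code.

def detect(frame_q, rect_q):
    frame = frame_q.get()                 # consume a frame
    rect_q.put(("rect-for", frame))       # pretend we detected a face

def track(rect_q, track_q):
    rect = rect_q.get()                   # consume a detection
    track_q.put(("tracked", rect))        # pretend we tracked it

if __name__ == "__main__":
    frame_q, rect_q, track_q = Queue(), Queue(), Queue()
    procs = [Process(target=detect, args=(frame_q, rect_q)),
             Process(target=track, args=(rect_q, track_q))]
    for p in procs:
        p.start()
    frame_q.put("frame-0")
    result = track_q.get()                # blocks until the pipeline delivers
    for p in procs:
        p.join()
    print(result)
```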
namespace Multitracking

@package Multitracking
1. Coordinates of the detected object are sent through the Rects Queue.
2. Currently, the MIL tracking algorithm is employed.
3. A rectangle is drawn over the tracked object and is sent to the main process through the TrackQueue

Functions

StartTracking(RectsQueue RectsQueue, TrackQueue TrackQueue)

The coordinates of the object to be tracked are received through the Rects Queue. MIL multi-tracking is used via OpenCV Contrib.
The drawn rectangle is passed over the TrackQueue.

namespace NewFacialFeaturesMsg

Functions

service_callback()
namespace ObjectRecognition

@package ObjectRecognition
1. This is a Python wrapper for the YOLO implementation in C.
2. Uses ctypes to access the C module.
3. libdarknet.so is a precompiled library that works only on Linux!

Functions

sample(probs probs)
c_array(ctype ctype, values values)
detect(net net, meta meta, image image, thresh thresh = .5, hier_thresh hier_thresh = .5, nms nms = .45)
draw_results(res res, img img)
Initialize()
detectObjects(frame frame)

Variables

int ObjectRecognition.detect_net = 0
int ObjectRecognition.detect_meta = 0
lib = CDLL("../darknet/libdarknet.so", RTLD_GLOBAL)
argtypes
restype
predict = lib.network_predict_p
make_boxes = lib.make_boxes
free_ptrs = lib.free_ptrs
num_boxes = lib.num_boxes
make_probs = lib.make_probs
detect = lib.network_predict_p
reset_rnn = lib.reset_rnn
load_net = lib.load_network_p
free_image = lib.free_image
letterbox_image = lib.letterbox_image
load_meta = lib.get_metadata
load_image = lib.load_image_color
predict_image = lib.network_predict_image
network_detect = lib.network_detect
int ObjectRecognition.DONE = 0
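The wrapper's BOX class (listed in the attributes above) mirrors darknet's C `struct box` via a ctypes Structure; together with the argtypes/restype declarations, it tells ctypes how to marshal calls into libdarknet.so. The sketch below shows the Structure declaration standalone, without loading the library:

```python
from ctypes import Structure, c_float

# The ctypes Structure mirroring darknet's C `struct box`, matching the
# _fields_ listed above. Shown standalone, without loading libdarknet.so.

class BOX(Structure):
    _fields_ = [("x", c_float), ("y", c_float),
                ("w", c_float), ("h", c_float)]

b = BOX(0.5, 0.25, 0.1, 0.2)   # fields are accessible as attributes
```

Instances of BOX can be passed to or received from C functions whose signatures use `box` by value or `box *` via POINTER(BOX).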
namespace RecogniseFace

@package RecogniseFace
1. Module responsible for face recognition
2. This uses a model already trained on LFW to extract facial features
3. Another model is trained on members of the Roboy team
4. We use an SVM classifier to match the detected face against the model of trained faces

Functions

recogniseFace(RectsQueue RectsQueue)
align_face_dlib(image image, face_box face_box, landmarkIndices landmarkIndices = AlignDlib.OUTER_EYES_AND_NOSE)
load_model(model_dir model_dir, model_meta model_meta, model_content model_content)

Function to load a tensorflow model.

TODO: to be moved into another module.

Return
Returns a tensorflow session
Parameters
  • model_dir: model directory
  • model_meta: meta file
  • model_content: checkpoint file

get_model_filenames(model_dir model_dir)

Helper function to load a tensorflow model.

TODO: to be moved into another module. The function finds the meta_file and checkpoint within a given path.

Return
Returns meta_file and checkpoint
Parameters
  • model_dir: Path where the model is stored

get_image_paths_and_labels(dataset dataset)
get_image_paths(facedir facedir)
get_dataset(paths paths, has_class_directories has_class_directories = True)
load_data(image_paths image_paths, do_random_crop do_random_crop, do_random_flip do_random_flip, image_size image_size, do_prewhiten do_prewhiten = True)
processImage(img img, do_random_crop do_random_crop, do_random_flip do_random_flip, image_size image_size, do_prewhiten do_prewhiten = True)
to_rgb(img img)
prewhiten(x x)
crop(image image, random_crop random_crop, image_size image_size)
flip(image image, random_flip random_flip)
train(session session)

Variables

int RecogniseFace.IMAGE_SIZE = 160
int RecogniseFace.EXPECT_SIZE = 160
namespace RoboyVision

@package RoboyVision
1. This is the main module.
2. All other components are created and spawned as separate processes.
3. It also creates message queues and passes them to the different processes

Functions

detectFaces(CameraQueue CameraQueue, FrameQueue FrameQueue, RectQueue RectQueue, FacePointQueue FacePointQueue, SpeakerQueue SpeakerQueue)
tracking(RectQueue RectQueue, TrackQueue TrackQueue)
speakerDetect(FacePointQueue FacePointQueue, SpeakerQueue SpeakerQueue, FrameQueue FrameQueue, VisualQueue VisualQueue)
recogniseFace(RectsQueue RectsQueue)
visualizer(CameraQueue CameraQueue, RectQueue RectQueue, FacePointQueue FacePointQueue, SpeakerQueue SpeakerQueue, FrameQueue FrameQueue, VisualQueue VisualQueue)
ObjectRecognise(CameraQueue CameraQueue, ObjectsQueue ObjectsQueue)

Variables

list RoboyVision.procs = []
CameraQueue = Queue()
FrameQueue = Queue()
RectQueue = Queue()
TrackQueue = Queue()
VisualQueue = Queue()
SpeakerQueue = Queue()
FacePointQueue = Queue()
ObjectsQueue = Queue()
detectFaceProc = \
target
detectFaces
args
SpeakerProc = \
speakerDetect
namespace RosMsgUtil

@package RosMsgUtil
1. This module has Util functions to send out ROS messages.
2. The send functions here are called from different modules.

Functions

AdvertiseNewFacialFeatures()
AdvertiseFaceCoordinates()
AdvertiseFindObject()
AdvertiseDescribeScene()
AdvertiseLookAtSpeaker()
AdvertiseContinously()
SendNewFacialFeatures(ff ff, speaking speaking, i i)
SendFaceCoordinates(id id, speaking speaking, position position, i i)
FindObject(type type, i i)
DescribeScene(i i)
LookAtSpeaker(i i)
ReceiveServiceRequests()

Variables

ws = websocket.WebSocket();
namespace setup

Variables

module = Extension('pyyolo', library_dirs=['.'], libraries=['yolo'], include_dirs=[numpy.get_include(), './darknet/include'], sources = ['module.c'])
name
version
description
ext_modules
namespace setup_gpu

Variables

module = Extension('pyyolo', library_dirs=['.', '/usr/local/cuda/lib64', '/usr/local/'], #libraries=['yolo', 'cuda', 'cudart', 'cublas', 'curand', 'cudnn'], libraries=['yolo', 'cuda', 'cudart', 'cublas', 'curand'], include_dirs=[numpy.get_include(), './darknet/include'], sources = ['module.c'])
name
version
description
ext_modules
namespace sl

ENUM to string

static std::string resolution2str(RESOLUTION res)
static std::string statusCode2str(SELF_CALIBRATION_STATE stat)
static DEPTH_MODE str2mode(std::string mode)
static std::string depthMode2str(DEPTH_MODE mode)
static std::string sensingMode2str(SENSING_MODE mode)
static std::string unit2str(UNIT unit)
static UNIT str2unit(std::string unit)
static std::string trackingState2str(TRACKING_STATE state)
static std::string spatialMappingState2str(SPATIAL_MAPPING_STATE state)

Typedefs

typedef float float1
typedef Vector2<float> float2
typedef Vector3<float> float3
typedef Vector4<float> float4
typedef unsigned char uchar1
typedef Vector2<unsigned char> uchar2
typedef Vector3<unsigned char> uchar3
typedef Vector4<unsigned char> uchar4
typedef double double1
typedef Vector2<double> double2
typedef Vector3<double> double3
typedef Vector4<double> double4
typedef unsigned int uint1
typedef Vector2<unsigned int> uint2
typedef Vector3<unsigned int> uint3
typedef Vector4<unsigned int> uint4

Enums

enum MEM

List available memory type.

Values:

MEM_CPU = 1

CPU Memory (Processor side).

MEM_GPU = 2

GPU Memory (Graphic card side).

enum COPY_TYPE

List available copy operation on Mat.

Values:

COPY_TYPE_CPU_CPU

copy data from CPU to CPU.

COPY_TYPE_CPU_GPU

copy data from CPU to GPU.

COPY_TYPE_GPU_GPU

copy data from GPU to GPU.

COPY_TYPE_GPU_CPU

copy data from GPU to CPU.

enum MAT_TYPE

List available Mat formats.

Values:

MAT_TYPE_32F_C1

float 1 channel.

MAT_TYPE_32F_C2

float 2 channels.

MAT_TYPE_32F_C3

float 3 channels.

MAT_TYPE_32F_C4

float 4 channels.

MAT_TYPE_8U_C1

unsigned char 1 channel.

MAT_TYPE_8U_C2

unsigned char 2 channels.

MAT_TYPE_8U_C3

unsigned char 3 channels.

MAT_TYPE_8U_C4

unsigned char 4 channels.

enum RESOLUTION

List available video resolutions.

Warning
Since v1.0, VGA mode has been updated to WVGA (from 640*480 to 672*376) and requires a firmware update to function (>= 1142). Firmware can be updated in the ZED Explorer.
Warning
NVIDIA Jetson boards do not support all ZED video resolutions and framerates. For more information, please read the online API documentation.

Values:

RESOLUTION_HD2K

2208*1242, available framerates: 15 fps.

RESOLUTION_HD1080

1920*1080, available framerates: 15, 30 fps.

RESOLUTION_HD720

1280*720, available framerates: 15, 30, 60 fps.

RESOLUTION_VGA

672*376, available framerates: 15, 30, 60, 100 fps.

RESOLUTION_LAST
enum CAMERA_SETTINGS

List available camera settings for the ZED camera (contrast, hue, saturation, gain...).

Each enum defines one of those settings.

Values:

CAMERA_SETTINGS_BRIGHTNESS

Defines the brightness control. Affected value should be between 0 and 8.

CAMERA_SETTINGS_CONTRAST

Defines the contrast control. Affected value should be between 0 and 8.

CAMERA_SETTINGS_HUE

Defines the hue control. Affected value should be between 0 and 11.

CAMERA_SETTINGS_SATURATION

Defines the saturation control. Affected value should be between 0 and 8.

CAMERA_SETTINGS_GAIN

Defines the gain control. Affected value should be between 0 and 100 for manual control. If ZED_EXPOSURE is set to -1, the gain is in auto mode too.

CAMERA_SETTINGS_EXPOSURE

Defines the exposure control. A value of -1 enables the AutoExposure/AutoGain control, as the boolean parameter (default) does. Affected value should be between 0 and 100 for manual control.

CAMERA_SETTINGS_WHITEBALANCE

Defines the color temperature control. Affected value should be between 2800 and 6500 with a step of 100. A value of -1 sets the AWB (auto white balance), as the boolean parameter (default) does.

CAMERA_SETTINGS_AUTO_WHITEBALANCE

Defines the status of white balance (automatic or manual). A value of 0 disables the AWB, while 1 activates it.

CAMERA_SETTINGS_LAST
enum SELF_CALIBRATION_STATE

Status for self calibration.

Since v0.9.3, self-calibration is done in the background and starts in the sl::Camera::open or reset function. You can check the current status of the self-calibration at any time once the ZED object has been constructed.

Values:

SELF_CALIBRATION_STATE_NOT_STARTED

Self calibration has not run yet (no sl::Camera::open or sl::Camera::resetSelfCalibration called).

SELF_CALIBRATION_STATE_RUNNING

Self calibration is currently running.

SELF_CALIBRATION_STATE_FAILED

Self calibration has finished running but did not manage to get accurate values. Old parameters are taken instead.

SELF_CALIBRATION_STATE_SUCCESS

Self calibration has finished running and did manage to get accurate values. New parameters are set.

SELF_CALIBRATION_STATE_LAST
enum DEPTH_MODE

List available depth computation modes.

Values:

DEPTH_MODE_NONE

This mode does not compute any depth map. Only rectified stereo images will be available.

DEPTH_MODE_PERFORMANCE

Fastest mode for depth computation.

DEPTH_MODE_MEDIUM

Balanced quality mode. Depth map is robust in any environment and requires medium resources for computation.

DEPTH_MODE_QUALITY

Best quality mode. Requires more compute power.

DEPTH_MODE_LAST
enum SENSING_MODE

List available depth sensing modes.

Values:

SENSING_MODE_STANDARD

This mode outputs ZED standard depth map that preserves edges and depth accuracy. Applications example: Obstacle detection, Automated navigation, People detection, 3D reconstruction.

SENSING_MODE_FILL

This mode outputs a smooth and fully dense depth map. Applications example: AR/VR, Mixed-reality capture, Image post-processing.

SENSING_MODE_LAST
enum UNIT

Enumerate for available metric unit of the depth.

Values:

UNIT_MILLIMETER
UNIT_CENTIMETER
UNIT_METER
UNIT_INCH
UNIT_FOOT
UNIT_LAST
enum COORDINATE_SYSTEM

List available coordinates systems for positional tracking and points cloud representation.

Positional tracking is provided in the given coordinates system.

Values:

COORDINATE_SYSTEM_IMAGE

Standard coordinates system in computer vision. Used in OpenCV; see http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html

COORDINATE_SYSTEM_LEFT_HANDED_Y_UP

Left-Handed with Y up and Z forward. Used in Unity with DirectX.

COORDINATE_SYSTEM_RIGHT_HANDED_Y_UP

Right-Handed with Y pointing up and Z backward. Used in OpenGL.

COORDINATE_SYSTEM_RIGHT_HANDED_Z_UP

Right-Handed with Z pointing up and Y forward. Used in 3DSMax.

COORDINATE_SYSTEM_LEFT_HANDED_Z_UP

Left-Handed with Z axis pointing up and X forward. Used in Unreal Engine.

COORDINATE_SYSTEM_LAST
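Converting a point between two of the systems listed above reduces to sign flips and axis swaps. As one illustration (a hedged sketch, not an SDK function): the IMAGE system (x right, y down, z forward) and the OpenGL RIGHT_HANDED_Y_UP system (x right, y up, z backward) are both right-handed, so negating y and z maps one onto the other:

```python
# Hedged sketch, not an SDK function: convert a point from the
# COORDINATE_SYSTEM_IMAGE convention (x right, y down, z forward) to the
# OpenGL RIGHT_HANDED_Y_UP convention (x right, y up, z backward).
# Both systems are right-handed, so negating y and z suffices
# (the change-of-basis matrix diag(1, -1, -1) has determinant +1).

def image_to_opengl(point):
    """Convert (x, y, z) from COORDINATE_SYSTEM_IMAGE to OpenGL convention."""
    x, y, z = point
    return (x, -y, -z)
```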
enum MEASURE

List retrievable measures.

Values:

MEASURE_DISPARITY

Disparity map, 1 channel, FLOAT.

MEASURE_DEPTH

Depth map, 1 channel, FLOAT.

MEASURE_CONFIDENCE

Certainty/confidence of the disparity map, 1 channel, FLOAT.

MEASURE_XYZ

Point cloud, 4 channels, FLOAT, channel 4 is empty.

MEASURE_XYZRGBA

Colored point cloud, 4 channels, FLOAT, channel 4 contains color in R-G-B-A order.

MEASURE_XYZBGRA

Colored point cloud, 4 channels, FLOAT, channel 4 contains color in B-G-R-A order.

MEASURE_XYZARGB

Colored point cloud, 4 channels, FLOAT, channel 4 contains color in A-R-G-B order.

MEASURE_XYZABGR

Colored point cloud, 4 channels, FLOAT, channel 4 contains color in A-B-G-R order.

MEASURE_LAST
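The colored point-cloud measures above store the color in the fourth FLOAT channel. A common convention for this (stated here as an assumption based on typical SDK practice, not something this documentation guarantees) is to pack the four 8-bit color components into the 32-bit float's bit pattern, which Python's struct module can round-trip:

```python
import struct

# Assumption, not guaranteed by this documentation: the 4th FLOAT
# channel of the XYZRGBA-style measures carries four 8-bit color
# components packed into the float's 32-bit pattern. struct can
# round-trip the bytes (little-endian float32 shown).

def pack_color(r, g, b, a):
    """Pack four 0-255 components into one float32 bit pattern."""
    return struct.unpack("<f", bytes([r, g, b, a]))[0]

def unpack_color(f):
    """Recover the four 0-255 components from the float32 bit pattern."""
    return tuple(struct.pack("<f", f))
```

The R-G-B-A, B-G-R-A, A-R-G-B, and A-B-G-R variants then differ only in the byte order fed to the packing step.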
enum VIEW

List available views.

Values:

VIEW_LEFT

Rectified left image.

VIEW_RIGHT

Rectified right image.

VIEW_LEFT_GRAY

Rectified left grayscale image.

VIEW_RIGHT_GRAY

Rectified right grayscale image.

VIEW_LEFT_UNRECTIFIED

Raw left image.

VIEW_RIGHT_UNRECTIFIED

Raw right image.

VIEW_LEFT_UNRECTIFIED_GRAY

Raw left grayscale image.

VIEW_RIGHT_UNRECTIFIED_GRAY

Raw right grayscale image.

VIEW_SIDE_BY_SIDE

Left and right image (the image width is therefore doubled).

VIEW_DEPTH

Normalized depth image.

VIEW_CONFIDENCE

Normalized confidence image.

VIEW_LAST
enum DEPTH_FORMAT

List available file formats for saving depth maps.

Values:

DEPTH_FORMAT_PNG

16-bit PNG image format. The 32-bit depth is mapped to a 16-bit image to preserve the consistency of the data range.

DEPTH_FORMAT_PFM

Portable Float Map: a byte-stream graphic image file format.

DEPTH_FORMAT_PGM

Portable Gray Map: a grayscale image format.

DEPTH_FORMAT_LAST
enum POINT_CLOUD_FORMAT

List available file formats for saving point clouds.

Stores the spatial coordinates (x,y,z) of each pixel and optionally its RGB color.

Values:

POINT_CLOUD_FORMAT_XYZ_ASCII

Generic point cloud file format, without color information.

POINT_CLOUD_FORMAT_PCD_ASCII

Point Cloud Data file, with color information.

POINT_CLOUD_FORMAT_PLY_ASCII

Polygon file format (PLY), with color information.

POINT_CLOUD_FORMAT_VTK_ASCII

Visualization ToolKit file, without color information.

POINT_CLOUD_FORMAT_LAST
enum TRACKING_STATE

List the different states of positional tracking.

Values:

TRACKING_STATE_SEARCHING

The camera is searching for a previously known position to locate itself.

TRACKING_STATE_OK

Positional tracking is working normally.

TRACKING_STATE_OFF

Positional tracking is not enabled.

TRACKING_STATE_FPS_TOO_LOW

Effective FPS is too low to give proper results for motion tracking. Consider using performance parameters (DEPTH_MODE_PERFORMANCE, a low camera resolution such as VGA or HD720).

TRACKING_STATE_LAST
enum AREA_EXPORT_STATE

List the different states of spatial memory area export.

Values:

AREA_EXPORT_STATE_SUCCESS

The spatial memory file has been successfully created.

AREA_EXPORT_STATE_RUNNING

The spatial memory is currently being written.

AREA_EXPORT_STATE_NOT_STARTED

The spatial memory file export has not been called.

AREA_EXPORT_STATE_FILE_EMPTY

The spatial memory contains no data; the file is empty.

AREA_EXPORT_STATE_FILE_ERROR

The spatial memory file has not been written because of a wrong file name.

AREA_EXPORT_STATE_SPATIAL_MEMORY_DISABLED

Spatial memory learning is disabled; no file can be created.

AREA_EXPORT_STATE_LAST
enum REFERENCE_FRAME

Define which type of position matrix is used to store camera path and pose.

Values:

REFERENCE_FRAME_WORLD

The transform of sl::Pose will contain the motion with reference to the world frame (previously called PATH).

REFERENCE_FRAME_CAMERA

The transform of sl::Pose will contain the motion with reference to the previous camera frame (previously called POSE).

REFERENCE_FRAME_LAST
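The two reference frames are related by composition: accumulating the per-frame transforms obtained with REFERENCE_FRAME_CAMERA reproduces the pose obtained with REFERENCE_FRAME_WORLD. A minimal sketch with 4x4 homogeneous transforms (helper names are hypothetical, not SDK calls):

```python
def matmul4(a, b):
    # 4x4 homogeneous-transform product (row-major nested lists).
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(dx, dy, dz):
    # Pure-translation homogeneous transform.
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]

# Chaining camera-frame deltas yields the world-frame pose.
world = translation(0, 0, 0)
for delta in [translation(0, 0, 0.1), translation(0, 0, 0.1)]:
    world = matmul4(world, delta)
```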
enum SPATIAL_MAPPING_STATE

Gives the spatial mapping state.

Values:

SPATIAL_MAPPING_STATE_INITIALIZING

The spatial mapping is initializing.

SPATIAL_MAPPING_STATE_OK

The depth and tracking data were correctly integrated in the fusion algorithm.

SPATIAL_MAPPING_STATE_NOT_ENOUGH_MEMORY

The maximum memory dedicated to scanning has been reached; the mesh will no longer be updated.

SPATIAL_MAPPING_STATE_NOT_ENABLED

Camera::enableSpatialMapping() wasn’t called (or the scanning was stopped and not relaunched).

SPATIAL_MAPPING_STATE_FPS_TOO_LOW

Effective FPS is too low to give proper results for spatial mapping. Consider using performance parameters (DEPTH_MODE_PERFORMANCE, a low camera resolution such as VGA or HD720, a low spatial mapping resolution).

SPATIAL_MAPPING_STATE_LAST
enum SVO_COMPRESSION_MODE

List available compression modes for SVO recording.

sl::SVO_COMPRESSION_MODE_LOSSLESS is an improvement over the previous lossless compression (used in ZED Explorer): even though the file size may be bigger, compression is much faster.

Values:

SVO_COMPRESSION_MODE_RAW

RAW images, no compression.

SVO_COMPRESSION_MODE_LOSSLESS

New lossless mode, with PNG/ZSTD-based compression (average size: 42% of RAW).

SVO_COMPRESSION_MODE_LOSSY

New lossy mode, with JPEG-based compression (average size: 22% of RAW).

SVO_COMPRESSION_MODE_LAST
enum MESH_FILE_FORMAT

Values:

MESH_FILE_PLY

Contains only vertices and faces.

MESH_FILE_PLY_BIN

Contains only vertices and faces, encoded in binary.

MESH_FILE_OBJ

Contains vertices, normals, faces, and texture information if possible.

MESH_FILE_LAST
enum MESH_TEXTURE_FORMAT

Values:

MESH_TEXTURE_RGB
MESH_TEXTURE_RGBA
MESH_TEXTURE_LAST
enum ERROR_CODE

List error codes in the ZED SDK.

Values:

SUCCESS

Standard code for successful behavior.

ERROR_CODE_FAILURE

Standard code for unsuccessful behavior.

ERROR_CODE_NO_GPU_COMPATIBLE

No GPU found or CUDA capability of the device is not supported.

ERROR_CODE_NOT_ENOUGH_GPUMEM

Not enough GPU memory for this depth mode; please try a faster mode (such as PERFORMANCE mode).

ERROR_CODE_CAMERA_NOT_DETECTED

The ZED camera is not plugged in or was not detected.

ERROR_CODE_INVALID_RESOLUTION

For Jetson only: this resolution is not yet supported (USB 3.0 bandwidth).

ERROR_CODE_LOW_USB_BANDWIDTH

This issue can occur when using multiple ZED cameras or a USB 2.0 port (bandwidth issue).

ERROR_CODE_CALIBRATION_FILE_NOT_AVAILABLE

ZED calibration file is not found on the host machine. Use ZED Explorer or ZED Calibration to get one.

ERROR_CODE_INVALID_SVO_FILE

The provided SVO file is not valid.

ERROR_CODE_SVO_RECORDING_ERROR

A recorder-related error occurred (not enough free storage, invalid file).

ERROR_CODE_INVALID_COORDINATE_SYSTEM

The requested coordinate system is not available.

ERROR_CODE_INVALID_FIRMWARE

The firmware of the ZED is out of date. Update to the latest version.

ERROR_CODE_NOT_A_NEW_FRAME

In grab() only: the current call returned the same frame as the last call; it is not a new frame.

ERROR_CODE_CUDA_ERROR

In grab() only: a CUDA error was detected in the process. Activate verbose mode in sl::Camera::open for more info.

ERROR_CODE_CAMERA_NOT_INITIALIZED

In grab() only: the ZED SDK is not initialized. Probably a missing call to sl::Camera::open.

ERROR_CODE_NVIDIA_DRIVER_OUT_OF_DATE

Your NVIDIA driver is too old and not compatible with your current CUDA version.

ERROR_CODE_INVALID_FUNCTION_CALL

The function call is not valid in the current context. Could be a missing call to sl::Camera::open.

ERROR_CODE_CORRUPTED_SDK_INSTALLATION

The SDK wasn’t able to load its dependencies; the installer should be launched again.

ERROR_CODE_LAST

Functions

SLSTEREO_EXPORT_DLL bool sl::saveDepthAs(sl::Camera & zed, sl::DEPTH_FORMAT format, sl::String name, float factor = 1.)

Writes the current depth map into a file.

Return
False if something went wrong; true otherwise.
Parameters
  • zed: the current camera object.
  • format: the desired depth file format.
  • name: the name (path) in which the depth will be saved.
  • factor: only for PNG and PGM, applies a gain to the depth value (default 1.). The maximum stored value is 65536, so you can set Camera::setDepthClampValue to 20000 and use a factor of 3. Do not forget to scale the pixel value by 1/factor to recover the real depth. Occlusions are represented by 0.
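The factor arithmetic above can be sketched with two hypothetical helpers (values are clamped to the 16-bit pixel range; 0 marks occlusions):

```python
def encode_depth(depth, factor=1.0):
    # Apply the gain and clamp to what a 16-bit PNG/PGM pixel can hold.
    return int(min(max(depth * factor, 0), 65535))

def decode_depth(pixel, factor=1.0):
    # Undo the gain to recover the real depth (0 means occlusion).
    return pixel / factor
```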

SLSTEREO_EXPORT_DLL bool sl::savePointCloudAs(sl::Camera & zed, sl::POINT_CLOUD_FORMAT format, sl::String name, bool with_color = false, bool keep_occluded_point = false)

Writes the current point cloud into a file.

Return
False if something went wrong; true otherwise.
Note
The color is not saved for XYZ and VTK files.
Parameters
  • zed: the current camera object.
  • format: the desired point cloud file format.
  • name: the name (path) in which the point cloud will be saved.
  • with_color: indicates if the color must be saved (default false). Not available for XYZ and VTK.
  • keep_occluded_point: indicates if unavailable data should be saved and set to 0 (default false); if true, the resulting point cloud has exactly height * width points.

MEM operator|(MEM a, MEM b)
static cudaError __cudaSafeCall(cudaError err, const char *func, const char *file, const int line)
static std::string errorCode2str(ERROR_CODE err)
void sleep_ms(int time)

Tells the program to wait for x ms.

Parameters
  • time: the number of ms to wait.

template <typename T>
std::ostream &operator<<(std::ostream &os, const Vector2<T> &v2)
template <typename T>
std::ostream &operator<<(std::ostream &os, const Vector3<T> &v3)
template <typename T>
std::ostream &operator<<(std::ostream &os, const Vector4<T> &v4)

Variables

class SL_CORE_EXPORT_DLL Rotation
class SL_CORE_EXPORT_DLL Translation
class SL_CORE_EXPORT_DLL Orientation
class SL_CORE_EXPORT_DLL Transform
std::vector<std::pair<int, int>> cameraResolution = [] { std::vector<std::pair<int, int>> v; v.emplace_back(2208, 1242); v.emplace_back(1920, 1080); v.emplace_back(1280, 720); v.emplace_back(672, 376); return v; }()

Available video modes for the ZED camera.

namespace SpeakerDetect

@package Speakerdetect
1. A face is received from the FaceDetect module.
2. The speaking algorithm tries to identify who is speaking.
3. Returns a dict of the people who are speaking and their IDs.

Functions

DetectSpeaker(FacepointQueue FacepointQueue, SpeakerQueue SpeakerQueue, FrameQueue FrameQueue, VisualQueue VisualQueue)
namespace std
namespace TransformCoordinates

Functions

coordinate_transform(x_camera x_camera, y_camera y_camera, z z)
namespace Visualizer

@package RecogniseFace
This is currently not used. We need to figure out a way to synchronize image access across different processes.
1. A single place used for visualization.

Functions

StartVisualization(CameraQueue CameraQueue, RectQueue RectQueue, FacePointQueue FacePointQueue, SpeakerQueue SpeakerQueue, FrameQueue FrameQueue, VisualQueue VisualQueue)
namespace voc_label

Functions

convert(size size, box box)
convert_annotation(year year, image_id image_id)

Variables

list voc_label.sets = [('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
list voc_label.classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
wd = getcwd()
image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
list_file = open('%s_%s.txt'%(year, image_set), 'w')
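The convert(size, box) helper above turns a VOC box given in absolute pixel coordinates (xmin, xmax, ymin, ymax) into the normalized center-based (x, y, w, h) format written to the label files. A sketch of that conversion (reconstructed for illustration, not copied from the script):

```python
def convert(size, box):
    # size = (image_width, image_height); box = (xmin, xmax, ymin, ymax).
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0   # box center, x
    y = (box[2] + box[3]) / 2.0   # box center, y
    w = box[1] - box[0]           # box width
    h = box[3] - box[2]           # box height
    return (x * dw, y * dh, w * dw, h * dh)
```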
namespace zedRoboy

Functions

load_meta(f f)
load_net(cfg cfg, weights weights)
load_img(f f)
letterbox_img(im im, w w, h h)
predict(net net, im im)
classify(net net, meta meta, im im)
detect(net net, meta meta, im im)

Variables

lib = CDLL("/usr/local/.so", RTLD_GLOBAL)
net = load_net("cfg/densenet.cfg", "/home/pjreddie/trained/densenet201.weights")
im = load_img("data/wolf.jpg")
meta = load_meta("cfg/imagenet1k.data")
r = classify(net, meta, im)
file goal.txt
file make_labels.py
file DescribeSceneSrv.py
file FaceDetect.py
file FindObjectSrv.py
file FreezeModel.py
file multiprocess.py
file Multitracking.py
file NewFacialFeaturesMsg.py
file ObjectRecognition.py
file art.c
#include "darknet.h"
#include <sys/time.h>

Functions

void demo_art(char *cfgfile, char *weightfile, int cam_index)
void run_art(int argc, char **argv)
file captcha.c
#include "darknet.h"

Functions

void fix_data_captcha(data d, int mask)
void train_captcha(char *cfgfile, char *weightfile)
void test_captcha(char *cfgfile, char *weightfile, char *filename)
void valid_captcha(char *cfgfile, char *weightfile, char *filename)
void run_captcha(int argc, char **argv)
file cifar.c
#include "darknet.h"

Functions

void train_cifar(char *cfgfile, char *weightfile)
void train_cifar_distill(char *cfgfile, char *weightfile)
void test_cifar_multi(char *filename, char *weightfile)
void test_cifar(char *filename, char *weightfile)
void extract_cifar()
void test_cifar_csv(char *filename, char *weightfile)
void test_cifar_csvtrain(char *filename, char *weightfile)
void eval_cifar_csv()
void run_cifar(int argc, char **argv)
file classifier.c
#include "darknet.h"
#include <sys/time.h>
#include <assert.h>

Functions

float *get_regression_values(char **labels, int n)
void train_classifier(char *datacfg, char *cfgfile, char *weightfile, int *gpus, int ngpus, int clear)
void validate_classifier_crop(char *datacfg, char *filename, char *weightfile)
void validate_classifier_10(char *datacfg, char *filename, char *weightfile)
void validate_classifier_full(char *datacfg, char *filename, char *weightfile)
void validate_classifier_single(char *datacfg, char *filename, char *weightfile)
void validate_classifier_multi(char *datacfg, char *filename, char *weightfile)
void try_classifier(char *datacfg, char *cfgfile, char *weightfile, char *filename, int layer_num)
void predict_classifier(char *datacfg, char *cfgfile, char *weightfile, char *filename, int top)
void label_classifier(char *datacfg, char *filename, char *weightfile)
void test_classifier(char *datacfg, char *cfgfile, char *weightfile, int target_layer)
void threat_classifier(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename)
void gun_classifier(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename)
void demo_classifier(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename)
void run_classifier(int argc, char **argv)
file coco.c
#include "darknet.h"
#include <stdio.h>

Functions

void train_coco(char *cfgfile, char *weightfile)
void print_cocos(FILE *fp, int image_id, box *boxes, float **probs, int num_boxes, int classes, int w, int h)
int get_coco_image_id(char *filename)
void validate_coco(char *cfgfile, char *weightfile)
void validate_coco_recall(char *cfgfile, char *weightfile)
void test_coco(char *cfgfile, char *weightfile, char *filename, float thresh)
void run_coco(int argc, char **argv)

Variables

char *coco_classes[] = {"person","bicycle","car","motorcycle","airplane","bus","train","truck","boat","traffic light","fire hydrant","stop sign","parking meter","bench","bird","cat","dog","horse","sheep","cow","elephant","bear","zebra","giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee","skis","snowboard","sports ball","kite","baseball bat","baseball glove","skateboard","surfboard","tennis racket","bottle","wine glass","cup","fork","knife","spoon","bowl","banana","apple","sandwich","orange","broccoli","carrot","hot dog","pizza","donut","cake","chair","couch","potted plant","bed","dining table","toilet","tv","laptop","mouse","remote","keyboard","cell phone","microwave","oven","toaster","sink","refrigerator","book","clock","vase","scissors","teddy bear","hair drier","toothbrush"}
int coco_ids[] = {1,2,3,4,5,6,7,8,9,10,11,13,14,15,16,17,18,19,20,21,22,23,24,25,27,28,31,32,33,34,35,36,37,38,39,40,41,42,43,44,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,67,70,72,73,74,75,76,77,78,79,80,81,82,84,85,86,87,88,89,90}
file darknet.c
#include "darknet.h"
#include <time.h>
#include <stdlib.h>
#include <stdio.h>

Functions

void predict_classifier(char *datacfg, char *cfgfile, char *weightfile, char *filename, int top)
void test_detector(char *datacfg, char *cfgfile, char *weightfile, char *filename, float thresh, float hier_thresh, char *outfile, int fullscreen)
void run_voxel(int argc, char **argv)
void run_yolo(int argc, char **argv)
void run_detector(int argc, char **argv)
void run_coco(int argc, char **argv)
void run_writing(int argc, char **argv)
void run_captcha(int argc, char **argv)
void run_nightmare(int argc, char **argv)
void run_dice(int argc, char **argv)
void run_compare(int argc, char **argv)
void run_classifier(int argc, char **argv)
void run_regressor(int argc, char **argv)
void run_segmenter(int argc, char **argv)
void run_char_rnn(int argc, char **argv)
void run_vid_rnn(int argc, char **argv)
void run_tag(int argc, char **argv)
void run_cifar(int argc, char **argv)
void run_go(int argc, char **argv)
void run_art(int argc, char **argv)
void run_super(int argc, char **argv)
void run_lsd(int argc, char **argv)
void average(int argc, char *argv[])
void speed(char *cfgfile, int tics)
void operations(char *cfgfile)
void oneoff(char *cfgfile, char *weightfile, char *outfile)
void oneoff2(char *cfgfile, char *weightfile, char *outfile, int l)
void partial(char *cfgfile, char *weightfile, char *outfile, int max)
void rescale_net(char *cfgfile, char *weightfile, char *outfile)
void rgbgr_net(char *cfgfile, char *weightfile, char *outfile)
void reset_normalize_net(char *cfgfile, char *weightfile, char *outfile)
layer normalize_layer(layer l, int n)
void normalize_net(char *cfgfile, char *weightfile, char *outfile)
void statistics_net(char *cfgfile, char *weightfile)
void denormalize_net(char *cfgfile, char *weightfile, char *outfile)
void mkimg(char *cfgfile, char *weightfile, int h, int w, int num, char *prefix)
void visualize(char *cfgfile, char *weightfile)
int main(int argc, char **argv)
file detector.c
#include "darknet.h"

Functions

void train_detector(char *datacfg, char *cfgfile, char *weightfile, int *gpus, int ngpus, int clear)
static int get_coco_image_id(char *filename)
static void print_cocos(FILE *fp, char *image_path, box *boxes, float **probs, int num_boxes, int classes, int w, int h)
void print_detector_detections(FILE **fps, char *id, box *boxes, float **probs, int total, int classes, int w, int h)
void print_imagenet_detections(FILE *fp, int id, box *boxes, float **probs, int total, int classes, int w, int h)
void validate_detector_flip(char *datacfg, char *cfgfile, char *weightfile, char *outfile)
void validate_detector(char *datacfg, char *cfgfile, char *weightfile, char *outfile)
void validate_detector_recall(char *cfgfile, char *weightfile)
void test_detector(char *datacfg, char *cfgfile, char *weightfile, char *filename, float thresh, float hier_thresh, char *outfile, int fullscreen)
void run_detector(int argc, char **argv)

Variables

int coco_ids[] = {1,2,3,4,5,6,7,8,9,10,11,13,14,15,16,17,18,19,20,21,22,23,24,25,27,28,31,32,33,34,35,36,37,38,39,40,41,42,43,44,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,67,70,72,73,74,75,76,77,78,79,80,81,82,84,85,86,87,88,89,90}
file dice.c
#include "darknet.h"

Functions

void train_dice(char *cfgfile, char *weightfile)
void validate_dice(char *filename, char *weightfile)
void test_dice(char *cfgfile, char *weightfile, char *filename)
void run_dice(int argc, char **argv)

Variables

char *dice_labels[] = {"face1","face2","face3","face4","face5","face6"}
file go.c
#include "darknet.h"
#include <unistd.h>

Functions

char *fgetgo(FILE *fp)
moves load_go_moves(char *filename)
void string_to_board(char *s, float *board)
void board_to_string(char *s, float *board)
data random_go_moves(moves m, int n)
void train_go(char *cfgfile, char *weightfile, char *filename, int *gpus, int ngpus, int clear)
void propagate_liberty(float *board, int *lib, int *visited, int row, int col, int side)
int *calculate_liberties(float *board)
void print_board(FILE *stream, float *board, int swap, int *indexes)
void flip_board(float *board)
void predict_move(network net, float *board, float *move, int multi)
void remove_connected(float *b, int *lib, int p, int r, int c)
void move_go(float *b, int p, int r, int c)
int makes_safe_go(float *b, int *lib, int p, int r, int c)
int suicide_go(float *b, int p, int r, int c)
int legal_go(float *b, char *ko, int p, int r, int c)
int generate_move(network net, int player, float *board, int multi, float thresh, float temp, char *ko, int print)
void valid_go(char *cfgfile, char *weightfile, int multi, char *filename)
int print_game(float *board, FILE *fp)
void engine_go(char *filename, char *weightfile, int multi)
void test_go(char *cfg, char *weights, int multi)
float score_game(float *board)
void self_go(char *filename, char *weightfile, char *f2, char *w2, int multi)
void run_go(int argc, char **argv)

Variables

int inverted = 1
int noi = 1
const int nind = 2
file lsd.c
#include "darknet.h"

Functions

void test_dcgan(char *cfgfile, char *weightfile)
void dcgan_batch(network gnet, network anet)
void train_dcgan(char *cfg, char *weight, char *acfg, char *aweight, int clear, int display, char *train_images)
void train_colorizer(char *cfg, char *weight, char *acfg, char *aweight, int clear, int display)
void test_lsd(char *cfgfile, char *weightfile, char *filename, int gray)
void run_lsd(int argc, char **argv)
file nightmare.c
#include "darknet.h"
#include <math.h>

Functions

float abs_mean(float *x, int n)
void calculate_loss(float *output, float *delta, int n, float thresh)
void optimize_picture(network *net, image orig, int max_layer, float scale, float rate, float thresh, int norm)
void smooth(image recon, image update, float lambda, int num)
void reconstruct_picture(network net, float *features, image recon, image update, float rate, float momentum, float lambda, int smooth_size, int iters)
void run_nightmare(int argc, char **argv)
file regressor.c
#include "darknet.h"
#include <sys/time.h>
#include <assert.h>

Functions

void train_regressor(char *datacfg, char *cfgfile, char *weightfile, int *gpus, int ngpus, int clear)
void predict_regressor(char *cfgfile, char *weightfile, char *filename)
void demo_regressor(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename)
void run_regressor(int argc, char **argv)
file rnn.c
#include "darknet.h"
#include <math.h>

Functions

int *read_tokenized_data(char *filename, size_t *read)
char **read_tokens(char *filename, size_t *read)
float_pair get_rnn_token_data(int *tokens, size_t *offsets, int characters, size_t len, int batch, int steps)
float_pair get_rnn_data(unsigned char *text, size_t *offsets, int characters, size_t len, int batch, int steps)
void reset_rnn_state(network net, int b)
void train_char_rnn(char *cfgfile, char *weightfile, char *filename, int clear, int tokenized)
void print_symbol(int n, char **tokens)
void test_char_rnn(char *cfgfile, char *weightfile, int num, char *seed, float temp, int rseed, char *token_file)
void test_tactic_rnn_multi(char *cfgfile, char *weightfile, int num, float temp, int rseed, char *token_file)
void test_tactic_rnn(char *cfgfile, char *weightfile, int num, float temp, int rseed, char *token_file)
void valid_tactic_rnn(char *cfgfile, char *weightfile, char *seed)
void valid_char_rnn(char *cfgfile, char *weightfile, char *seed)
void vec_char_rnn(char *cfgfile, char *weightfile, char *seed)
void run_char_rnn(int argc, char **argv)
file rnn_vid.c
#include "darknet.h"

Functions

void run_vid_rnn(int argc, char **argv)
file segmenter.c
#include "darknet.h"
#include <sys/time.h>
#include <assert.h>

Functions

void train_segmenter(char *datacfg, char *cfgfile, char *weightfile, int *gpus, int ngpus, int clear, int display)
void predict_segmenter(char *datafile, char *cfgfile, char *weightfile, char *filename)
void demo_segmenter(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename)
void run_segmenter(int argc, char **argv)
file super.c
#include "darknet.h"

Functions

void train_super(char *cfgfile, char *weightfile, int clear)
void test_super(char *cfgfile, char *weightfile, char *filename)
void run_super(int argc, char **argv)
file swag.c
#include "darknet.h"
#include <sys/time.h>

Functions

void train_swag(char *cfgfile, char *weightfile)
void run_swag(int argc, char **argv)
file tag.c
#include "darknet.h"

Functions

void train_tag(char *cfgfile, char *weightfile, int clear)
void test_tag(char *cfgfile, char *weightfile, char *filename)
void run_tag(int argc, char **argv)
file voxel.c
#include "darknet.h"

Functions

void extract_voxel(char *lfile, char *rfile, char *prefix)
void train_voxel(char *cfgfile, char *weightfile)
void test_voxel(char *cfgfile, char *weightfile, char *filename)
void run_voxel(int argc, char **argv)
file writing.c
#include "darknet.h"

Functions

void train_writing(char *cfgfile, char *weightfile)
void test_writing(char *cfgfile, char *weightfile, char *filename)
void run_writing(int argc, char **argv)
file yolo.c
#include "darknet.h"

Functions

void train_yolo(char *cfgfile, char *weightfile)
void print_yolo_detections(FILE **fps, char *id, box *boxes, float **probs, int total, int classes, int w, int h)
void validate_yolo(char *cfgfile, char *weightfile)
void validate_yolo_recall(char *cfgfile, char *weightfile)
void test_yolo(char *cfgfile, char *weightfile, char *filename, float thresh)
void run_yolo(int argc, char **argv)

Variables

char *voc_names[] = {"aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"}
file darknet.h
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>

Defines

SECRET_NUM

Typedefs

typedef struct network network
typedef struct layer layer
typedef struct matrix matrix
typedef struct load_args load_args
typedef struct node node
typedef struct list list

Enums

enum ACTIVATION

Values:

LOGISTIC
RELU
RELIE
LINEAR
RAMP
TANH
PLSE
LEAKY
ELU
LOGGY
STAIR
HARDTAN
LHTAN
enum LAYER_TYPE

Values:

CONVOLUTIONAL
DECONVOLUTIONAL
CONNECTED
MAXPOOL
SOFTMAX
DETECTION
DROPOUT
CROP
ROUTE
COST
NORMALIZATION
AVGPOOL
LOCAL
SHORTCUT
ACTIVE
RNN
GRU
LSTM
CRNN
BATCHNORM
NETWORK
XNOR
REGION
REORG
BLANK
enum COST_TYPE

Values:

SSE
MASKED
L1
SEG
SMOOTH
enum learning_rate_policy

Values:

CONSTANT
STEP
EXP
POLY
STEPS
SIG
RANDOM
enum data_type

Values:

CLASSIFICATION_DATA
DETECTION_DATA
CAPTCHA_DATA
REGION_DATA
IMAGE_DATA
COMPARE_DATA
WRITING_DATA
SWAG_DATA
TAG_DATA
OLD_CLASSIFICATION_DATA
STUDY_DATA
DET_DATA
SUPER_DATA
LETTERBOX_DATA
REGRESSION_DATA
SEGMENTATION_DATA
INSTANCE_DATA

Functions

metadata get_metadata(char *file)
void free_layer(layer)
network load_network(char *cfg, char *weights, int clear)
network *load_network_p(char *cfg, char *weights, int clear)
load_args get_base_args(network net)
void free_data(data d)
pthread_t load_data(load_args args)
list *read_data_cfg(char *filename)
list *read_cfg(char *filename)
void forward_network(network net)
void backward_network(network net)
void update_network(network net)
void axpy_cpu(int N, float ALPHA, float *X, int INCX, float *Y, int INCY)
void copy_cpu(int N, float *X, int INCX, float *Y, int INCY)
void scal_cpu(int N, float ALPHA, float *X, int INCX)
void normalize_cpu(float *x, float *mean, float *variance, int batch, int filters, int spatial)
int best_3d_shift_r(image a, image b, int min, int max)
void save_image_png(image im, const char *name)
void get_next_batch(data d, int n, int offset, float *X, float *y)
void grayscale_image_3c(image im)
void normalize_image(image p)
void matrix_to_csv(matrix m)
float train_network_sgd(network net, data d, int n)
void rgbgr_image(image im)
data copy_data(data d)
data concat_data(data d1, data d2)
data load_cifar10_data(char *filename)
float matrix_topk_accuracy(matrix truth, matrix guess, int k)
void matrix_add_matrix(matrix from, matrix to)
void scale_matrix(matrix m, float scale)
matrix csv_to_matrix(char *filename)
float *network_accuracies(network net, data d, int n)
float train_network_datum(network net)
image make_random_image(int w, int h, int c)
void denormalize_connected_layer(layer l)
void denormalize_convolutional_layer(layer l)
void statistics_connected_layer(layer l)
void rescale_weights(layer l, float scale, float trans)
void rgbgr_weights(layer l)
image *get_weights(layer l)
void demo(char *cfgfile, char *weightfile, float thresh, int cam_index, const char *filename, char **names, int classes, int frame_skip, char *prefix, int avg, float hier_thresh, int w, int h, int fps, int fullscreen)
void get_detection_boxes(layer l, int w, int h, float thresh, float **probs, box *boxes, int only_objectness)
char *option_find_str(list *l, char *key, char *def)
int option_find_int(list *l, char *key, int def)
network parse_network_cfg(char *filename)
void save_weights(network net, char *filename)
void load_weights(network *net, char *filename)
void save_weights_upto(network net, char *filename, int cutoff)
void load_weights_upto(network *net, char *filename, int start, int cutoff)
void zero_objectness(layer l)
void get_region_boxes(layer l, int w, int h, int netw, int neth, float thresh, float **probs, box *boxes, float **masks, int only_objectness, int *map, float tree_thresh, int relative)
void free_network(network net)
void set_batch_network(network *net, int b)
image load_image(char *filename, int w, int h, int c)
image load_image_color(char *filename, int w, int h)
image make_image(int w, int h, int c)
image resize_image(image im, int w, int h)
image letterbox_image(image im, int w, int h)
image crop_image(image im, int dx, int dy, int w, int h)
image resize_min(image im, int min)
image threshold_image(image im, float thresh)
image mask_to_rgb(image mask)
int resize_network(network *net, int w, int h)
void free_matrix(matrix m)
void test_resize(char *filename)
void save_image(image p, const char *name)
void show_image(image p, const char *name)
image copy_image(image p)
void draw_box_width(image a, int x1, int y1, int x2, int y2, int w, float r, float g, float b)
float get_current_rate(network net)
void composite_3d(char *f1, char *f2, char *out, int delta)
data load_data_old(char **paths, int n, int m, char **labels, int k, int w, int h)
size_t get_current_batch(network net)
void constrain_image(image im)
image get_network_image_layer(network net, int i)
layer get_network_output_layer(network net)
void top_predictions(network net, int n, int *index)
void flip_image(image a)
image float_to_image(int w, int h, int c, float *data)
void ghost_image(image source, image dest, int dx, int dy)
float network_accuracy(network net, data d)
void random_distort_image(image im, float hue, float saturation, float exposure)
void fill_image(image m, float s)
image grayscale_image(image im)
void rotate_image_cw(image im, int times)
double what_time_is_it_now()
image rotate_image(image m, float rad)
void visualize_network(network net)
float box_iou(box a, box b)
void do_nms(box *boxes, float **probs, int total, int classes, float thresh)
data load_all_cifar10()
box_label *read_boxes(char *filename, int *n)
box float_to_box(float *f, int stride)
void draw_detections(image im, int num, float thresh, box *boxes, float **probs, float **masks, char **names, image **alphabet, int classes)
matrix network_predict_data(network net, data test)
image **load_alphabet()
image get_network_image(network net)
float *network_predict(network net, float *input)
float *network_predict_p(network *net, float *input)
int network_width(network *net)
int network_height(network *net)
float *network_predict_image(network *net, image im)
char **get_labels(char *filename)
void do_nms_sort(box *boxes, float **probs, int total, int classes, float thresh)
void do_nms_obj(box *boxes, float **probs, int total, int classes, float thresh)
matrix make_matrix(int rows, int cols)
void free_image(image m)
float train_network(network net, data d)
pthread_t load_data_in_thread(load_args args)
void load_data_blocking(load_args args)
list *get_paths(char *filename)
void hierarchy_predictions(float *predictions, int n, tree *hier, int only_leaves, int stride)
void change_leaves(tree *t, char *leaf_list)
int find_int_arg(int argc, char **argv, char *arg, int def)
float find_float_arg(int argc, char **argv, char *arg, float def)
int find_arg(int argc, char *argv[], char *arg)
char *find_char_arg(int argc, char **argv, char *arg, char *def)
char *basecfg(char *cfgfile)
void find_replace(char *str, char *orig, char *rep, char *output)
void free_ptrs(void **ptrs, int n)
char *fgetl(FILE *fp)
void strip(char *s)
float sec(clock_t clocks)
void **list_to_array(list *l)
void top_k(float *a, int n, int k, int *index)
int *read_map(char *filename)
void error(const char *s)
int max_index(float *a, int n)
int sample_array(float *a, int n)
void free_list(list *l)
float mse_array(float *a, int n)
float variance_array(float *a, int n)
float mag_array(float *a, int n)
float mean_array(float *a, int n)
void normalize_array(float *a, int n)
int *read_intlist(char *s, int *n, int d)
size_t rand_size_t()
float rand_normal()
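Among the helpers above, box_iou(box a, box b) computes intersection-over-union for boxes stored in darknet's center format (x, y, w, h). A sketch of the computation, written in Python for brevity (not the C source itself):

```python
def box_iou(a, b):
    # Boxes are (x_center, y_center, w, h), as in darknet's `box` struct.
    def overlap(x1, w1, x2, w2):
        # 1-D overlap of two centered intervals.
        left = max(x1 - w1 / 2, x2 - w2 / 2)
        right = min(x1 + w1 / 2, x2 + w2 / 2)
        return right - left

    ow = overlap(a[0], a[2], b[0], b[2])
    oh = overlap(a[1], a[3], b[1], b[3])
    if ow <= 0 or oh <= 0:
        return 0.0  # no intersection
    inter = ow * oh
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union
```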

Variables

int gpu_index
file darknet.py
file README.md
file README.md
file voc_label.py
file activation_layer.c
#include “activation_layer.h”#include “utils.h”#include “cuda.h”#include “blas.h”#include “gemm.h”#include <math.h>#include <stdio.h>#include <stdlib.h>#include <string.h>

Functions

layer make_activation_layer(int batch, int inputs, ACTIVATION activation)
void forward_activation_layer(layer l, network net)
void backward_activation_layer(layer l, network net)
file activation_layer.h
#include "activations.h"
#include "layer.h"
#include "network.h"

Functions

layer make_activation_layer(int batch, int inputs, ACTIVATION activation)
void forward_activation_layer(layer l, network net)
void backward_activation_layer(layer l, network net)
file activations.c
#include "activations.h"
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Functions

char *get_activation_string(ACTIVATION a)
ACTIVATION get_activation(char *s)
float activate(float x, ACTIVATION a)
void activate_array(float *x, const int n, const ACTIVATION a)
float gradient(float x, ACTIVATION a)
void gradient_array(const float *x, const int n, const ACTIVATION a, float *delta)
file activations.h
#include "darknet.h"
#include "cuda.h"
#include "math.h"

Functions

ACTIVATION get_activation(char *s)
char *get_activation_string(ACTIVATION a)
float activate(float x, ACTIVATION a)
float gradient(float x, ACTIVATION a)
void gradient_array(const float *x, const int n, const ACTIVATION a, float *delta)
void activate_array(float *x, const int n, const ACTIVATION a)
static float stair_activate(float x)
static float hardtan_activate(float x)
static float linear_activate(float x)
static float logistic_activate(float x)
static float loggy_activate(float x)
static float relu_activate(float x)
static float elu_activate(float x)
static float relie_activate(float x)
static float ramp_activate(float x)
static float leaky_activate(float x)
static float tanh_activate(float x)
static float plse_activate(float x)
static float lhtan_activate(float x)
static float lhtan_gradient(float x)
static float hardtan_gradient(float x)
static float linear_gradient(float x)
static float logistic_gradient(float x)
static float loggy_gradient(float x)
static float stair_gradient(float x)
static float relu_gradient(float x)
static float elu_gradient(float x)
static float relie_gradient(float x)
static float ramp_gradient(float x)
static float leaky_gradient(float x)
static float tanh_gradient(float x)
static float plse_gradient(float x)
file avgpool_layer.c
#include "avgpool_layer.h"
#include "cuda.h"
#include <stdio.h>

Functions

avgpool_layer make_avgpool_layer(int batch, int w, int h, int c)
void resize_avgpool_layer(avgpool_layer *l, int w, int h)
void forward_avgpool_layer(const avgpool_layer l, network net)
void backward_avgpool_layer(const avgpool_layer l, network net)
file avgpool_layer.h
#include "image.h"
#include "cuda.h"
#include "layer.h"
#include "network.h"

Typedefs

typedef layer avgpool_layer

Functions

image get_avgpool_image(avgpool_layer l)
avgpool_layer make_avgpool_layer(int batch, int w, int h, int c)
void resize_avgpool_layer(avgpool_layer *l, int w, int h)
void forward_avgpool_layer(const avgpool_layer l, network net)
void backward_avgpool_layer(const avgpool_layer l, network net)
file batchnorm_layer.c
#include "convolutional_layer.h"
#include "batchnorm_layer.h"
#include "blas.h"
#include <stdio.h>

Functions

layer make_batchnorm_layer(int batch, int w, int h, int c)
void backward_scale_cpu(float *x_norm, float *delta, int batch, int n, int size, float *scale_updates)
void mean_delta_cpu(float *delta, float *variance, int batch, int filters, int spatial, float *mean_delta)
void variance_delta_cpu(float *x, float *delta, float *mean, float *variance, int batch, int filters, int spatial, float *variance_delta)
void normalize_delta_cpu(float *x, float *mean, float *variance, float *mean_delta, float *variance_delta, int batch, int filters, int spatial, float *delta)
void resize_batchnorm_layer(layer *layer, int w, int h)
void forward_batchnorm_layer(layer l, network net)
void backward_batchnorm_layer(layer l, network net)
file batchnorm_layer.h
#include "image.h"
#include "layer.h"
#include "network.h"

Functions

layer make_batchnorm_layer(int batch, int w, int h, int c)
void forward_batchnorm_layer(layer l, network net)
void backward_batchnorm_layer(layer l, network net)
file blas.c
#include "blas.h"
#include <math.h>
#include <assert.h>
#include <float.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Functions

void reorg_cpu(float *x, int w, int h, int c, int batch, int stride, int forward, float *out)
void flatten(float *x, int size, int layers, int batch, int forward)
void weighted_sum_cpu(float *a, float *b, float *s, int n, float *c)
void weighted_delta_cpu(float *a, float *b, float *s, float *da, float *db, float *ds, int n, float *dc)
void shortcut_cpu(int batch, int w1, int h1, int c1, float *add, int w2, int h2, int c2, float *out)
void mean_cpu(float *x, int batch, int filters, int spatial, float *mean)
void variance_cpu(float *x, float *mean, int batch, int filters, int spatial, float *variance)
void normalize_cpu(float *x, float *mean, float *variance, int batch, int filters, int spatial)
void const_cpu(int N, float ALPHA, float *X, int INCX)
void mul_cpu(int N, float *X, int INCX, float *Y, int INCY)
void pow_cpu(int N, float ALPHA, float *X, int INCX, float *Y, int INCY)
void axpy_cpu(int N, float ALPHA, float *X, int INCX, float *Y, int INCY)
void scal_cpu(int N, float ALPHA, float *X, int INCX)
void fill_cpu(int N, float ALPHA, float *X, int INCX)
void deinter_cpu(int NX, float *X, int NY, float *Y, int B, float *OUT)
void inter_cpu(int NX, float *X, int NY, float *Y, int B, float *OUT)
void copy_cpu(int N, float *X, int INCX, float *Y, int INCY)
void mult_add_into_cpu(int N, float *X, float *Y, float *Z)
void smooth_l1_cpu(int n, float *pred, float *truth, float *delta, float *error)
void l1_cpu(int n, float *pred, float *truth, float *delta, float *error)
void l2_cpu(int n, float *pred, float *truth, float *delta, float *error)
float dot_cpu(int N, float *X, int INCX, float *Y, int INCY)
void softmax(float *input, int n, float temp, int stride, float *output)
void softmax_cpu(float *input, int n, int batch, int batch_offset, int groups, int group_offset, int stride, float temp, float *output)
file blas.h
#include "darknet.h"

Functions

void flatten(float *x, int size, int layers, int batch, int forward)
void pm(int M, int N, float *A)
float *random_matrix(int rows, int cols)
void time_random_matrix(int TA, int TB, int m, int k, int n)
void reorg_cpu(float *x, int w, int h, int c, int batch, int stride, int forward, float *out)
void test_blas()
void inter_cpu(int NX, float *X, int NY, float *Y, int B, float *OUT)
void deinter_cpu(int NX, float *X, int NY, float *Y, int B, float *OUT)
void mult_add_into_cpu(int N, float *X, float *Y, float *Z)
void const_cpu(int N, float ALPHA, float *X, int INCX)
void constrain_gpu(int N, float ALPHA, float *X, int INCX)
void pow_cpu(int N, float ALPHA, float *X, int INCX, float *Y, int INCY)
void mul_cpu(int N, float *X, int INCX, float *Y, int INCY)
void fill_cpu(int N, float ALPHA, float *X, int INCX)
float dot_cpu(int N, float *X, int INCX, float *Y, int INCY)
int test_gpu_blas()
void shortcut_cpu(int batch, int w1, int h1, int c1, float *add, int w2, int h2, int c2, float *out)
void mean_cpu(float *x, int batch, int filters, int spatial, float *mean)
void variance_cpu(float *x, float *mean, int batch, int filters, int spatial, float *variance)
void scale_bias(float *output, float *scales, int batch, int n, int size)
void backward_scale_cpu(float *x_norm, float *delta, int batch, int n, int size, float *scale_updates)
void mean_delta_cpu(float *delta, float *variance, int batch, int filters, int spatial, float *mean_delta)
void variance_delta_cpu(float *x, float *delta, float *mean, float *variance, int batch, int filters, int spatial, float *variance_delta)
void normalize_delta_cpu(float *x, float *mean, float *variance, float *mean_delta, float *variance_delta, int batch, int filters, int spatial, float *delta)
void smooth_l1_cpu(int n, float *pred, float *truth, float *delta, float *error)
void l2_cpu(int n, float *pred, float *truth, float *delta, float *error)
void l1_cpu(int n, float *pred, float *truth, float *delta, float *error)
void weighted_sum_cpu(float *a, float *b, float *s, int num, float *c)
void weighted_delta_cpu(float *a, float *b, float *s, float *da, float *db, float *ds, int n, float *dc)
void softmax(float *input, int n, float temp, int stride, float *output)
void softmax_cpu(float *input, int n, int batch, int batch_offset, int groups, int group_offset, int stride, float temp, float *output)
file box.c
#include "box.h"
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

Functions

box float_to_box(float *f, int stride)
dbox derivative(box a, box b)
float overlap(float x1, float w1, float x2, float w2)
float box_intersection(box a, box b)
float box_union(box a, box b)
float box_iou(box a, box b)
float box_rmse(box a, box b)
dbox dintersect(box a, box b)
dbox dunion(box a, box b)
void test_dunion()
void test_dintersect()
void test_box()
dbox diou(box a, box b)
int nms_comparator(const void *pa, const void *pb)
void do_nms_obj(box *boxes, float **probs, int total, int classes, float thresh)
void do_nms_sort(box *boxes, float **probs, int total, int classes, float thresh)
void do_nms(box *boxes, float **probs, int total, int classes, float thresh)
box encode_box(box b, box anchor)
box decode_box(box b, box anchor)
file box.h
#include "darknet.h"

Functions

float box_rmse(box a, box b)
dbox diou(box a, box b)
box decode_box(box b, box anchor)
box encode_box(box b, box anchor)
file classifier.h
file col2im.c
#include <stdio.h>
#include <math.h>

Functions

void col2im_add_pixel(float *im, int height, int width, int channels, int row, int col, int channel, int pad, float val)
void col2im_cpu(float *data_col, int channels, int height, int width, int ksize, int stride, int pad, float *data_im)
file col2im.h

Functions

void col2im_cpu(float *data_col, int channels, int height, int width, int ksize, int stride, int pad, float *data_im)
file compare.c
#include <stdio.h>
#include "network.h"
#include "detection_layer.h"
#include "cost_layer.h"
#include "utils.h"
#include "parser.h"
#include "box.h"

Functions

void train_compare(char *cfgfile, char *weightfile)
void validate_compare(char *filename, char *weightfile)
int elo_comparator(const void *a, const void *b)
int bbox_comparator(const void *a, const void *b)
void bbox_update(sortable_bbox *a, sortable_bbox *b, int class, int result)
void bbox_fight(network net, sortable_bbox *a, sortable_bbox *b, int classes, int class)
void SortMaster3000(char *filename, char *weightfile)
void BattleRoyaleWithCheese(char *filename, char *weightfile)
void run_compare(int argc, char **argv)

Variables

int total_compares = 0
int current_class = 0
file connected_layer.c
#include "connected_layer.h"
#include "convolutional_layer.h"
#include "batchnorm_layer.h"
#include "utils.h"
#include "cuda.h"
#include "blas.h"
#include "gemm.h"
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Functions

layer make_connected_layer(int batch, int inputs, int outputs, ACTIVATION activation, int batch_normalize, int adam)
void update_connected_layer(layer l, update_args a)
void forward_connected_layer(layer l, network net)
void backward_connected_layer(layer l, network net)
void denormalize_connected_layer(layer l)
void statistics_connected_layer(layer l)
file connected_layer.h
#include "activations.h"
#include "layer.h"
#include "network.h"

Functions

layer make_connected_layer(int batch, int inputs, int outputs, ACTIVATION activation, int batch_normalize, int adam)
void forward_connected_layer(layer l, network net)
void backward_connected_layer(layer l, network net)
void update_connected_layer(layer l, update_args a)
file convolutional_layer.c
#include "convolutional_layer.h"
#include "utils.h"
#include "batchnorm_layer.h"
#include "im2col.h"
#include "col2im.h"
#include "blas.h"
#include "gemm.h"
#include <stdio.h>
#include <time.h>

Functions

void swap_binary(convolutional_layer *l)
void binarize_weights(float *weights, int n, int size, float *binary)
void binarize_cpu(float *input, int n, float *binary)
void binarize_input(float *input, int n, int size, float *binary)
int convolutional_out_height(convolutional_layer l)
int convolutional_out_width(convolutional_layer l)
image get_convolutional_image(convolutional_layer l)
image get_convolutional_delta(convolutional_layer l)
static size_t get_workspace_size(layer l)
convolutional_layer make_convolutional_layer(int batch, int h, int w, int c, int n, int size, int stride, int padding, ACTIVATION activation, int batch_normalize, int binary, int xnor, int adam)
void denormalize_convolutional_layer(convolutional_layer l)
void resize_convolutional_layer(convolutional_layer *l, int w, int h)
void add_bias(float *output, float *biases, int batch, int n, int size)
void scale_bias(float *output, float *scales, int batch, int n, int size)
void backward_bias(float *bias_updates, float *delta, int batch, int n, int size)
void forward_convolutional_layer(convolutional_layer l, network net)
void backward_convolutional_layer(convolutional_layer l, network net)
void update_convolutional_layer(convolutional_layer l, update_args a)
image get_convolutional_weight(convolutional_layer l, int i)
void rgbgr_weights(convolutional_layer l)
void rescale_weights(convolutional_layer l, float scale, float trans)
image *get_weights(convolutional_layer l)
image *visualize_convolutional_layer(convolutional_layer l, char *window, image *prev_weights)
file convolutional_layer.h
#include "cuda.h"
#include "image.h"
#include "activations.h"
#include "layer.h"
#include "network.h"

Typedefs

typedef layer convolutional_layer

Functions

convolutional_layer make_convolutional_layer(int batch, int h, int w, int c, int n, int size, int stride, int padding, ACTIVATION activation, int batch_normalize, int binary, int xnor, int adam)
void resize_convolutional_layer(convolutional_layer *layer, int w, int h)
void forward_convolutional_layer(const convolutional_layer layer, network net)
void update_convolutional_layer(convolutional_layer layer, update_args a)
image *visualize_convolutional_layer(convolutional_layer layer, char *window, image *prev_weights)
void binarize_weights(float *weights, int n, int size, float *binary)
void swap_binary(convolutional_layer *l)
void binarize_weights2(float *weights, int n, int size, char *binary, float *scales)
void backward_convolutional_layer(convolutional_layer layer, network net)
void add_bias(float *output, float *biases, int batch, int n, int size)
void backward_bias(float *bias_updates, float *delta, int batch, int n, int size)
image get_convolutional_image(convolutional_layer layer)
image get_convolutional_delta(convolutional_layer layer)
image get_convolutional_weight(convolutional_layer layer, int i)
int convolutional_out_height(convolutional_layer layer)
int convolutional_out_width(convolutional_layer layer)
file cost_layer.c
#include "cost_layer.h"
#include "utils.h"
#include "cuda.h"
#include "blas.h"
#include <math.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

Functions

COST_TYPE get_cost_type(char *s)
char *get_cost_string(COST_TYPE a)
cost_layer make_cost_layer(int batch, int inputs, COST_TYPE cost_type, float scale)
void resize_cost_layer(cost_layer *l, int inputs)
void forward_cost_layer(cost_layer l, network net)
void backward_cost_layer(const cost_layer l, network net)
file cost_layer.h
#include "layer.h"
#include "network.h"

Typedefs

typedef layer cost_layer

Functions

COST_TYPE get_cost_type(char *s)
char *get_cost_string(COST_TYPE a)
cost_layer make_cost_layer(int batch, int inputs, COST_TYPE type, float scale)
void forward_cost_layer(const cost_layer l, network net)
void backward_cost_layer(const cost_layer l, network net)
void resize_cost_layer(cost_layer *l, int inputs)
file crnn_layer.c
#include "crnn_layer.h"
#include "convolutional_layer.h"
#include "utils.h"
#include "cuda.h"
#include "blas.h"
#include "gemm.h"
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Functions

static void increment_layer(layer *l, int steps)
layer make_crnn_layer(int batch, int h, int w, int c, int hidden_filters, int output_filters, int steps, ACTIVATION activation, int batch_normalize)
void update_crnn_layer(layer l, update_args a)
void forward_crnn_layer(layer l, network net)
void backward_crnn_layer(layer l, network net)
file crnn_layer.h
#include "activations.h"
#include "layer.h"
#include "network.h"

Functions

layer make_crnn_layer(int batch, int h, int w, int c, int hidden_filters, int output_filters, int steps, ACTIVATION activation, int batch_normalize)
void forward_crnn_layer(layer l, network net)
void backward_crnn_layer(layer l, network net)
void update_crnn_layer(layer l, update_args a)
file crop_layer.c
#include "crop_layer.h"
#include "cuda.h"
#include <stdio.h>

Functions

image get_crop_image(crop_layer l)
void backward_crop_layer(const crop_layer l, network net)
void backward_crop_layer_gpu(const crop_layer l, network net)
crop_layer make_crop_layer(int batch, int h, int w, int c, int crop_height, int crop_width, int flip, float angle, float saturation, float exposure)
void resize_crop_layer(layer *l, int w, int h)
void forward_crop_layer(const crop_layer l, network net)
file crop_layer.h
#include "image.h"
#include "layer.h"
#include "network.h"

Typedefs

typedef layer crop_layer

Functions

image get_crop_image(crop_layer l)
crop_layer make_crop_layer(int batch, int h, int w, int c, int crop_height, int crop_width, int flip, float angle, float saturation, float exposure)
void forward_crop_layer(const crop_layer l, network net)
void resize_crop_layer(layer *l, int w, int h)
file cuda.c

Variables

int gpu_index = 0
file cuda.h
#include "darknet.h"
file data.c
#include "data.h"
#include "utils.h"
#include "image.h"
#include "cuda.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

Defines

NUMCHARS

Functions

list *get_paths(char *filename)
char **get_random_paths(char **paths, int n, int m)
char **find_replace_paths(char **paths, int n, char *find, char *replace)
matrix load_image_paths_gray(char **paths, int n, int w, int h)
matrix load_image_paths(char **paths, int n, int w, int h)
matrix load_image_augment_paths(char **paths, int n, int min, int max, int size, float angle, float aspect, float hue, float saturation, float exposure, int center)
box_label *read_boxes(char *filename, int *n)
void randomize_boxes(box_label *b, int n)
void correct_boxes(box_label *boxes, int n, float dx, float dy, float sx, float sy, int flip)
void fill_truth_swag(char *pa