muscima.cropobject module
This module implements a Python representation of the CropObject,
the basic unit of annotation. See the CropObject
documentation.
-
class muscima.cropobject.CropObject(objid, clsname, top, left, width, height, outlinks=None, inlinks=None, mask=None, uid=None, data=None)
Bases: object
One annotated object.
The CropObject represents one instance of an annotation. It implements the following attributes:
- objid: the unique number of the given annotation instance in the set of annotations encoded in the containing CropObjectList.
- uid: the globally unique identifier of the annotation instance. String. See the CropObject.parse_uid() method for format details.
- clsname: the name of the label that was given to the annotation (this is the human-readable string, such as notehead-full).
- top: the vertical coordinate (row) of the upper left corner pixel.
- left: the horizontal coordinate (column) of the upper left corner pixel.
- bottom: the vertical coordinate (row) of the lower right corner pixel + 1, so that you can index the corresponding image rows using img[c.top:c.bottom].
- right: the horizontal coordinate (column) of the lower right corner pixel + 1, so that you can index the corresponding image columns using img[:, c.left:c.right].
- width: the number of columns that the CropObject spans.
- height: the number of rows that the CropObject spans.
- mask: a binary (0/1) numpy array that denotes the area within the CropObject’s bounding box (specified by top, left, height and width) that the CropObject actually occupies. If the mask is None, the object is understood to occupy the entire bounding box.
- data: a dictionary that can be empty, or can contain anything. It is generated from the optional <Data> element of a CropObject.
Constructing a simple CropObject that consists of the “b”-like flat music notation symbol (never mind the uid for now):
>>> import numpy
>>> from muscima.cropobject import CropObject
>>> top = 10
>>> left = 15
>>> height = 10
>>> width = 4
>>> mask = numpy.array([[1, 1, 0, 0],
...                     [1, 0, 0, 0],
...                     [1, 0, 0, 0],
...                     [1, 0, 0, 0],
...                     [1, 0, 1, 1],
...                     [1, 1, 1, 1],
...                     [1, 0, 0, 1],
...                     [1, 0, 1, 1],
...                     [1, 1, 1, 0],
...                     [0, 1, 0, 0]])
>>> clsname = 'flat'
>>> uid = 'MUSCIMA++_1.0___muscima.cropobject.CropObject.doctest___0'
>>> c = CropObject(objid=0, clsname=clsname,
...                top=top, left=left, height=height, width=width,
...                inlinks=[], outlinks=[],
...                mask=mask,
...                uid=uid)
CropObjects can also form graphs, using the following attributes:
- outlinks: Outgoing edges. A list of integers; it is assumed they are valid objid values within the same global/doc namespace.
- inlinks: Incoming edges. A list of integers; it is assumed they are valid objid values within the same global/doc namespace.
So far, CropObject graphs do not support multiple relationship types.
Unique identification
The uid serves to identify the CropObject uniquely, at least within the MUSCIMA dataset system. (We anticipate further versions of the dataset, and need to plan for that.)
To uniquely identify a CropObject, we need three “levels”:
- The “global”, dataset-level identification: which dataset is this CropObject coming from? (For this dataset: MUSCIMA++_1.0)
- The “local”, document-level identification: which document (within the given dataset) is this CropObject coming from? For MUSCIMA++ 1.0, this will usually be a string like CVC-MUSCIMA_W-35_N-08_D-ideal, derived from the filename under which the CropObjectList containing the given CropObject is stored.
- The within-document identification, which is identical to the objid.
These three components are joined together into one string by a delimiter: ___
The full uid of a CropObject then might look like this:
MUSCIMA-pp_1.0___CVC-MUSCIMA_W-35_N-08_D-ideal___611
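The three-level scheme can be illustrated with a small standalone sketch. The helper names build_uid and split_uid are hypothetical (they are not part of muscima); only the ___ delimiter and the example strings are taken from this page.

```python
UID_DELIMITER = '___'  # same value as CropObject.UID_DELIMITER documented on this page


def build_uid(dataset, doc, objid):
    """Join the dataset, document and within-document levels into one uid."""
    return UID_DELIMITER.join([dataset, doc, str(objid)])


def split_uid(uid):
    """Split a uid into (dataset, doc, objid); the objid comes back as an int."""
    dataset, doc, num = uid.split(UID_DELIMITER)
    return dataset, doc, int(num)


uid = build_uid('MUSCIMA-pp_1.0', 'CVC-MUSCIMA_W-35_N-08_D-ideal', 611)
print(uid)             # MUSCIMA-pp_1.0___CVC-MUSCIMA_W-35_N-08_D-ideal___611
print(split_uid(uid))  # ('MUSCIMA-pp_1.0', 'CVC-MUSCIMA_W-35_N-08_D-ideal', 611)
```

Splitting on the triple underscore works here because the dataset and document names only contain single underscores; a name containing ___ itself would break this naive split.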
You will need to use UIDs whenever you are combining CropObjects from different documents, and/or datasets. (If you are really combining datasets, make sure you know what you are doing – some annotation instructions may change between versions, so objects of the same class might not exactly correspond to each other…) The dataset and document names are available through appropriate instance attributes:
>>> c.doc
'muscima.cropobject.CropObject.doctest'
>>> c.dataset
'MUSCIMA++_1.0'
If you supply no uid at initialization time, a default UID will be used:
>>> c.default_uid
'MUSCIMA_DEFAULT_DATASET_PLACEHOLDER___default-document___0'
(Don’t abuse the default, though! It’s intended just for transitioning documents without UIDs to those that have them.)
On the other hand, the objid is a field intended to uniquely identify a CropObject within the scope of one CropObject list (one annotation document).
Caution
The scope of unique identification within MUSCIMA++ is only within a <CropObjectList>. Don’t use objid to mix CropObjects from multiple files!
CropObjects and images
CropObjects and images are not tightly bound. This is because the same object can apply to multiple images: in the case of the CVC-MUSCIMA dataset, for example, the same CropObjects are present both in the full image and in the staff-less image. The limitation here is that CropObjects are based on exact pixels, so in order to retain validity, the images must correspond to each other exactly, as “layers”.
Because CropObjects do not correspond to any given image, there is no facility in the data format to link them to a specific one. You have to take care of matching CropObject annotations to the right images by yourself.
The CropObject class implements some interactions with images.
To recover the area corresponding to a CropObject c, use:
>>> if c.mask is not None: crop = img[c.top:c.bottom, c.left:c.right] * c.mask
>>> if c.mask is None: crop = img[c.top:c.bottom, c.left:c.right]
Because this is clunky, we have implemented the following to get the crop:
>>> crop = c.project_to(img)
And to get the CropObject projected onto the entire image:
>>> crop = c.project_on(img)
Above, note the multiplicative role of the mask: while we would typically expect the mask to be binary, this is not strictly necessary. You could supply a different mask interpretation, such as a probabilistic one. However, we strongly advise against using this feature unless you have a really good reason; remember that the CropObject is supposed to represent an annotation of a given image. (One possible use for a non-binary mask that we can envision is aggregating multiple annotations of the same image.)
For visualization, there is a more sophisticated method that renders the CropObject as a transparent colored rectangle over an RGB image. (NOTE: this really changes the input image!)
>>> import matplotlib.pyplot as plt
>>> c.render(img)
>>> plt.imshow(img); plt.show()
However, CropObject.render() currently does not support rendering the mask.
Disambiguating class names
Since the class names are present through the clsname attribute (<MLClassName> element), matching against the MLClassList is no longer necessary for general understanding of the file. The MLClassList file serves as a disambiguation tool: there may be multiple annotation projects that use the same names but define them differently and use different guidelines, and their respective MLClassLists allow you to interpret the symbol names correctly, in light of the corresponding set of definitions.
Note
In MUSCIMarker, the MLClassList is currently necessary to define how CropObjects are displayed: their color. (All noteheads are red, all barlines are green, etc.) The other function, matching names to clsid, has been superseded by the clsname CropObject attribute.
Merging CropObjects
To merge a list of CropObjects into a new one, you need to:
- Compute the new object’s bounding box: cropobjects_merge_bbox()
- Compute the new object’s mask: cropobjects_merge_mask()
- Determine the clsname and objid of the new object.
Since the objid and clsname of a merge may depend on external settings and generally cannot be reliably determined from the merged objects themselves (e.g. the merge of a notehead and a stem should be a new note symbol), you need to supply them externally. However, the bounding box and mask can be determined. The bounding box is computed simply as the smallest bounding box that encompasses all the CropObjects, and the mask is an OR operation over the individual masks (or None, if the CropObjects don’t have masks). Note that the merge cannot deal with a situation where only some of the objects have a mask.
Implementation notes on the mask
The mask is a numpy array that will be saved using run-length encoding. The numpy array is first flattened, then runs of successive 0’s and 1’s are encoded as e.g. 0:10 for a run of 10 zeros.
How much space does this take?
Objects tend to be relatively convex, so after flattening, we can expect more or less two runs per row (flattening is done in C order). Because each run takes (approximately) 5 characters, each mask takes roughly 5 * n_rows bytes to encode. This makes it efficient for objects wider than 5 pixels, with a compression ratio of approximately n_cols / 5. (Also, the numpy array needs to be made C-contiguous for that, which explains the order='C' hack in set_mask().)
-
UID_DEFAULT_DATASET_NAMESPACE = 'MUSCIMA_DEFAULT_DATASET_PLACEHOLDER'
-
UID_DEFAULT_DOCUMENT_NAMESPACE = 'default-document'
-
UID_DELIMITER = '___'
-
bbox_intersection(bounding_box)
Returns the sub-bounding box of this CropObject, relative to its size (so: 0,0 is the CropObject’s upper left corner), that intersects the given bounding box. If the intersection is empty, returns None.
>>> c = CropObject(0, 'test', 10, 100, height=20, width=10)
>>> c.bounding_box
(10, 100, 30, 110)
>>> other_bbox = 20, 100, 40, 105
>>> c.bbox_intersection(other_bbox)
(10, 0, 20, 5)
>>> containing_bbox = 4, 55, 44, 115
>>> c.bbox_intersection(containing_bbox)
(0, 0, 20, 10)
>>> contained_bbox = 12, 102, 22, 108
>>> c.bbox_intersection(contained_bbox)
(2, 2, 12, 8)
>>> non_overlapping_bbox = 0, 0, 3, 3
>>> c.bbox_intersection(non_overlapping_bbox) is None
True
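The relative-coordinate behavior shown in the doctest can be reproduced with a short standalone sketch. The function name relative_bbox_intersection is hypothetical, boxes are plain (top, left, bottom, right) tuples, and treating boxes that merely touch as non-intersecting is an assumption of this sketch rather than documented library behavior.

```python
def relative_bbox_intersection(own_bbox, other_bbox):
    """Intersect two (top, left, bottom, right) boxes and express the result
    relative to the top-left corner of own_bbox. Returns None if disjoint."""
    t, l, b, r = own_bbox
    ot, ol, ob, o_r = other_bbox
    # Absolute coordinates of the intersection rectangle.
    it, il = max(t, ot), max(l, ol)
    ib, ir = min(b, ob), min(r, o_r)
    if ib <= it or ir <= il:
        return None
    # Shift so that (0, 0) is the upper left corner of own_bbox.
    return it - t, il - l, ib - t, ir - l


own = (10, 100, 30, 110)
print(relative_bbox_intersection(own, (20, 100, 40, 105)))  # (10, 0, 20, 5)
print(relative_bbox_intersection(own, (0, 0, 3, 3)))        # None
```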
-
static bbox_to_integer_bounds(ftop, fleft, fbottom, fright)
Rounds the CropObject bounds outward to integers so that no area is lost (i.e. bottom and right bounds are rounded up, top and left bounds are rounded down).
Returns the rounded-off bounds (top, left, bottom, right) as integers.
>>> CropObject.bbox_to_integer_bounds(44.2, 18.9, 55.1, 92.99)
(44, 18, 56, 93)
>>> CropObject.bbox_to_integer_bounds(44, 18, 56, 92.99)
(44, 18, 56, 93)
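A minimal sketch of this outward rounding, using only the standard library. The function name outward_integer_bounds is hypothetical; the rule (floor for top and left, ceiling for bottom and right) follows the description above.

```python
import math


def outward_integer_bounds(ftop, fleft, fbottom, fright):
    """Round float bounds outward so the integer box covers the float box."""
    return (math.floor(ftop), math.floor(fleft),
            math.ceil(fbottom), math.ceil(fright))


print(outward_integer_bounds(44.2, 18.9, 55.1, 92.99))  # (44, 18, 56, 93)
print(outward_integer_bounds(44, 18, 56, 92.99))        # (44, 18, 56, 93)
```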
-
bottom
Row coordinate 1 beyond the bottom right corner, so that indexing in the form img[c.top:c.bottom] is possible.
-
bounding_box
The top, left, bottom, right tuple of the CropObject’s coordinates.
-
contains(bounding_box_or_cropobject)
Check if this CropObject entirely contains the other bounding box (or the other CropObject’s bounding box).
-
crop_to_mask()
Crops itself to the minimum bounding box that contains all its pixels, as determined by its mask.
If the mask is all zeros, does nothing: in any situation where you care whether the object is empty (e.g. to delete it after trimming), the is_empty check should be invoked anyway.
>>> mask = numpy.zeros((20, 10))
>>> mask[5:15, 3:8] = 1
>>> c = CropObject(0, 'test', 10, 100, width=10, height=20, mask=mask)
>>> c.bounding_box
(10, 100, 30, 110)
>>> c.crop_to_mask()
>>> c.bounding_box
(15, 103, 25, 108)
>>> c.height, c.width
(10, 5)
Assumes integer bounds, which is ensured during CropObject initialization.
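The tightening step can be sketched without numpy by representing the mask as a list of rows. The function name crop_bbox_to_mask is hypothetical, and the sketch only computes the new bounding box; it does not crop the mask itself.

```python
def crop_bbox_to_mask(bbox, mask):
    """bbox is (top, left, bottom, right); mask is a list of rows of 0/1
    values with the same shape as the box. Returns the tightened bbox, or
    the unchanged bbox if the mask contains no nonzero pixel."""
    top, left, bottom, right = bbox
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        return bbox
    cols = [j for j in range(len(mask[0])) if any(row[j] for row in mask)]
    return (top + rows[0], left + cols[0],
            top + rows[-1] + 1, left + cols[-1] + 1)


# Same setup as the doctest: a 20 x 10 mask with ones in rows 5..14, cols 3..7.
mask = [[0] * 10 for _ in range(20)]
for i in range(5, 15):
    for j in range(3, 8):
        mask[i][j] = 1
print(crop_bbox_to_mask((10, 100, 30, 110), mask))  # (15, 103, 25, 108)
```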
-
dataset
Which dataset is this CropObject coming from? For bookkeeping.
-
decode_mask(mask_string, shape)
Decodes a CropObject mask string into a binary numpy array of the given shape.
-
static decode_mask_bitmap(mask_string, shape)
Decodes the mask array from the encoded form to the 2D numpy array.
-
static decode_mask_rle(mask_string, shape)
Decodes the mask array from the RLE-encoded form to the 2D numpy array.
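As an illustration of RLE decoding, here is a minimal sketch that assumes the space-separated value:length format described on this page (e.g. 0:2 1:3 0:3 1:2) and row-major (C order) flattening. The function name decode_rle is hypothetical, and it returns nested lists rather than a numpy array.

```python
def decode_rle(mask_string, shape):
    """Expand 'value:length' runs and reshape row by row into shape."""
    n_rows, n_cols = shape
    flat = []
    for token in mask_string.split():
        value, length = token.split(':')
        flat.extend([int(value)] * int(length))
    if len(flat) != n_rows * n_cols:
        raise ValueError('decoded length does not match the given shape')
    return [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


print(decode_rle('0:2 1:3 0:3 1:2', (2, 5)))
# [[0, 0, 1, 1, 1], [0, 0, 0, 1, 1]]
```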
-
default_uid
Constructs the default uid that the CropObject would have, unless one was supplied at initialization.
>>> c.default_uid
'MUSCIMA_DEFAULT_DATASET_PLACEHOLDER___default-document___0'
-
doc
Which document within the dataset is this CropObject coming from? (The _document_namespace.)
This is important when working with CropObjects from multiple CropObjectList files, especially for properly constructing CropObject graphs, because inlinks and outlinks use the numeric objids, which point to CropObjects within the same document. The objid of each CropObject has to be unique within a document.
-
encode_mask(mask, compress=False, mode='rle')
Encode a binary array mask as a string, compliant with the CropObject format specification in muscima.io.
-
static encode_mask_bitmap(mask, compress=False)
Encodes the mask array in a compact form. Returns ‘None’ if mask is None. If the mask is not None, uses the following algorithm:
- Flatten the mask (then use width and height of CropObject for reshaping).
- Record as string, with whitespace separator
- Compress string using gz2 (if compress=True) NOT IMPLEMENTED
- Return resulting string
-
static encode_mask_rle(mask, compress=False)
Encodes the mask array in Run-Length Encoding. Instead of having the bitmap 0 0 1 1 1 0 0 0 1 1, the RLE encodes the mask as 0:2 1:3 0:3 1:2. This is much more compact.
Currently, the rows of the mask are not treated in any special way. The mask just gets flattened and then encoded.
Implementation:
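The library's own implementation is not reproduced on this page. As an illustration only, the encoding described above can be sketched with itertools.groupby; the function name encode_rle is hypothetical, and the mask is given as a list of rows instead of a numpy array.

```python
from itertools import groupby


def encode_rle(mask_rows):
    """Flatten the mask row by row (C order) and encode runs as value:length."""
    flat = [int(v) for row in mask_rows for v in row]
    return ' '.join('{}:{}'.format(value, len(list(run)))
                    for value, run in groupby(flat))


print(encode_rle([[0, 0, 1, 1, 1],
                  [0, 0, 0, 1, 1]]))  # 0:2 1:3 0:3 1:2
```

Because rows are not treated specially, a run may continue across a row boundary, as in a mask whose first row ends with ones and whose second row starts with ones.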
-
get_inlink_objects(cropobjects)
Out of the given cropobjects list, return a list of those from which this CropObject has inlinks.
Can deal with CropObjects from multiple documents.
-
get_outlink_objects(cropobjects)
Out of the given cropobjects list, return a list of those to which this CropObject has outlinks.
Can deal with CropObjects from multiple documents.
-
inlink_uids
-
is_empty
A CropObject is empty if it is composed of zero pixels. This is measured through the mask. CropObjects without a mask are assumed to be non-empty.
-
join(other)
CropObject “addition”: performs an OR on this and the other CropObject’s masks and bounding boxes, and assigns the result to this CropObject. Also merges the inlinks and outlinks.
Works only if the document spaces for both CropObjects are the same. (Otherwise changes nothing.)
The clsname of the other is ignored.
-
left
Column coordinate of upper left corner.
-
middle
Returns the integer representation of where the middle of the CropObject lies, as a (m_vert, m_horz) tuple.
The integers just get rounded down.
-
outlink_uids
-
overlaps(bounding_box_or_cropobject)
Check whether this CropObject overlaps the given bounding box or CropObject.
>>> c = CropObject(0, 'test', 10, 100, height=20, width=10)
>>> c.bounding_box
(10, 100, 30, 110)
>>> c.overlaps((10, 100, 30, 110))  # Exact match
True
>>> c.overlaps((0, 100, 8, 110))    # Row mismatch
False
>>> c.overlaps((10, 0, 30, 89))     # Column mismatch
False
>>> c.overlaps((0, 0, 8, 89))       # Total mismatch
False
>>> c.overlaps((9, 99, 31, 111))    # Encompasses CropObject
True
>>> c.overlaps((11, 101, 29, 109))  # Within CropObject
True
>>> c.overlaps((9, 101, 31, 109))   # Encompass horz., within vert.
True
>>> c.overlaps((11, 99, 29, 111))   # Encompasses vert., within horz.
True
>>> c.overlaps((11, 101, 31, 111))  # Corner within: top left
True
>>> c.overlaps((11, 99, 31, 109))   # Corner within: top right
True
>>> c.overlaps((9, 101, 29, 111))   # Corner within: bottom left
True
>>> c.overlaps((9, 99, 29, 109))    # Corner within: bottom right
True
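All of the cases in the doctest are consistent with a plain interval test on rows and columns. The sketch below is one way to write it; the function name bboxes_overlap is hypothetical, and treating the bounds as half-open (so that boxes sharing only an edge do not overlap) is an assumption, since the doctest contains no touching case.

```python
def bboxes_overlap(a, b):
    """True if two (top, left, bottom, right) boxes share at least one pixel,
    with bottom and right taken as exclusive bounds."""
    at, al, ab, ar = a
    bt, bl, bb, br = b
    rows_overlap = at < bb and bt < ab
    cols_overlap = al < br and bl < ar
    return rows_overlap and cols_overlap


box = (10, 100, 30, 110)
print(bboxes_overlap(box, (10, 100, 30, 110)))  # True
print(bboxes_overlap(box, (0, 100, 8, 110)))    # False
print(bboxes_overlap(box, (10, 0, 30, 89)))     # False
```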
-
parse_uid()
Parse the unique identifier of the CropObject. This breaks down the UID into the global namespace, document namespace (i.e. CropObjectList name, usually per image), and the numeric ID of the CropObject within one CropObjectList. This numeric ID should always match the objid, which acts as the “technical” identifier, since it is known to be an integer and therefore usable for e.g. indexing within the MUSCIMarker annotation app.
See _parse_uid() for format & test. Compared to _parse_uid(), this method checks the parsed object_id in the uid against this CropObject’s objid, to verify that the UID is really valid for this object.
The delimiter is expected to be ___ (kept as CropObject.UID_DELIMITER).
-
project_on(img)
This function returns only those parts of the input image that correspond to the CropObject and masks out everything else with zeros. The dimension of the returned array is the same as that of the input image. This function basically reconstructs the symbol as an indicator function over the pixels of the annotated image.
-
project_to(img)
This function returns the crop of the input image corresponding to the CropObject (incl. masking). Assumes zeros are background.
-
render(img, alpha=0.3, rgb=(1.0, 0.0, 0.0))
Renders itself upon the given image as a rectangle of the given color and transparency. Might help visualization.
Parameters: img – A three-channel image (3-D numpy array, with the last dimension being 3).
-
right
Column coordinate 1 beyond the bottom right corner, so that indexing in the form img[:, c.left:c.right] is possible.
-
set_mask(mask)
Sets the CropObject’s mask to the given array. Performs some compatibility checks: size, dtype (converts to uint8).
-
set_objid(objid)
Changes the objid and updates the UID with it. Do NOT use this unless you know what you’re doing; changing the objid should be (1) checked against objid conflicts within the doc, (2) reflected in the outlinks and inlinks.
-
set_uid(uid)
Assigns the given uid to the CropObject. This is the way to do it; do not assign directly to cropobject.uid! You need to update other things (and perform integrity checks) when changing the unique ID! See the CropObject class documentation for information on how uid attributes work.
Do NOT use this function unless you know what you are doing! You could mess up the integrity of your copy of the dataset, and you’d have to download it again…
-
to_integer_bounds()
Ensures that the CropObject has an integer position and size. (This is important whenever you want to use a mask, and reasonable whenever you do not need sub-pixel resolution…)
-
top
Row coordinate of upper left corner.
-
muscima.cropobject.bbox_dice(bbox_this, bbox_other, vertical=False, horizontal=False)
Compute the Dice coefficient (intersection over union) for the given two bounding boxes.
Parameters:
- vertical – If set, will only return vertical IoU.
- horizontal – If set, will only return horizontal IoU. If both vertical and horizontal are set, will return normal IoU, as if they were both false.
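The description above uses both “Dice coefficient” and “IoU”. In their standard definitions these are two different (though monotonically related) overlap measures, so the sketch below computes both for comparison. It makes no claim about which of the two bbox_dice actually returns; all function names here are hypothetical, and the vertical/horizontal options are not covered.

```python
def bbox_area(bbox):
    """Area of a (top, left, bottom, right) box; degenerate boxes count as 0."""
    t, l, b, r = bbox
    return max(0, b - t) * max(0, r - l)


def intersection_area(a, b):
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    return max(0, bottom - top) * max(0, right - left)


def iou(a, b):
    """Intersection over union: |A & B| / |A | B|."""
    inter = intersection_area(a, b)
    union = bbox_area(a) + bbox_area(b) - inter
    return inter / union if union else 0.0


def dice(a, b):
    """Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    inter = intersection_area(a, b)
    total = bbox_area(a) + bbox_area(b)
    return 2 * inter / total if total else 0.0


a = (0, 0, 10, 10)
b = (0, 5, 10, 15)
print(iou(a, b))   # one third: 50 shared pixels out of a union of 150
print(dice(a, b))  # 0.5
```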
-
muscima.cropobject.bbox_intersection(bbox_this, bbox_other)
Returns the t, l, b, r coordinates of the sub-bounding box of bbox_this that is also inside bbox_other. If the bounding boxes do not overlap, returns None.
-
muscima.cropobject.cropobject_distance(c, d)
Computes the distance between two CropObjects. Their minimum vertical and horizontal distances are each taken separately, and the Euclidean norm is computed from them.
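A minimal sketch of such a distance on plain bounding boxes follows. The names interval_gap and bbox_distance are hypothetical, and working on bounding boxes only (ignoring masks), with overlapping extents contributing a gap of 0, is an assumption of this sketch.

```python
import math


def interval_gap(lo1, hi1, lo2, hi2):
    """Gap between half-open intervals [lo1, hi1) and [lo2, hi2); 0 if they overlap."""
    return max(0, lo2 - hi1, lo1 - hi2)


def bbox_distance(a, b):
    """Euclidean norm of the vertical and horizontal gaps between two
    (top, left, bottom, right) boxes."""
    dv = interval_gap(a[0], a[2], b[0], b[2])
    dh = interval_gap(a[1], a[3], b[1], b[3])
    return math.hypot(dv, dh)


print(bbox_distance((0, 0, 10, 10), (13, 14, 20, 20)))  # 5.0
```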
-
muscima.cropobject.cropobject_mask_rpf(cropobject_gt, cropobject_pred)
Compute the recall, precision and f-score of the predicted cropobject’s mask against the ground truth cropobject’s mask.
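Pixel-wise recall, precision and f-score can be sketched as follows. The function name mask_rpf is hypothetical, and the sketch assumes both masks have already been placed on the same pixel grid with the same shape; aligning two CropObjects with different bounding boxes is not shown.

```python
def mask_rpf(gt_mask, pred_mask):
    """Return (recall, precision, f-score) for two equally shaped 0/1 masks,
    each given as a list of rows."""
    tp = fp = fn = 0
    for gt_row, pred_row in zip(gt_mask, pred_mask):
        for g, p in zip(gt_row, pred_row):
            if g and p:
                tp += 1      # predicted and in the ground truth
            elif p:
                fp += 1      # predicted but not in the ground truth
            elif g:
                fn += 1      # in the ground truth but missed
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    total = recall + precision
    fscore = 2 * recall * precision / total if total else 0.0
    return recall, precision, fscore


gt = [[1, 1, 0],
      [1, 1, 0]]
pred = [[0, 1, 1],
        [0, 1, 1]]
print(mask_rpf(gt, pred))  # (0.5, 0.5, 0.5)
```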
-
muscima.cropobject.cropobjects_merge(fr, to, clsname, objid)
Merge the two given CropObjects, fr and to. Returns the new CropObject (without modifying any of the inputs).
-
muscima.cropobject.cropobjects_merge_bbox(cropobjects)
Computes the bounding box of a CropObject that would result from merging the given list of CropObjects.
-
muscima.cropobject.cropobjects_merge_links(cropobjects)
Collect all inlinks and outlinks of the given set of CropObjects to CropObjects outside of this set. The rationale for this is that these given cropobjects will be merged into one, so relationships within the set would become loops and disappear.
(Note that this is not sufficient to update the relationships upon a merge, because the affected CropObjects outside the given set will need to have their inlinks/outlinks redirected to the new object.)
Returns: A tuple of lists: (inlinks, outlinks)
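The link collection can be sketched with plain dictionaries standing in for CropObjects. The function name merge_links and the dictionary layout are hypothetical; returning sorted lists is a choice made here for reproducible output.

```python
def merge_links(objects):
    """objects: list of dicts with 'objid', 'inlinks' and 'outlinks' keys.
    Returns (inlinks, outlinks) that point outside the given set."""
    inside = {o['objid'] for o in objects}
    inlinks, outlinks = set(), set()
    for o in objects:
        # Links between members of the set would become self-loops: drop them.
        inlinks.update(i for i in o['inlinks'] if i not in inside)
        outlinks.update(j for j in o['outlinks'] if j not in inside)
    return sorted(inlinks), sorted(outlinks)


notehead = {'objid': 1, 'inlinks': [7], 'outlinks': [2, 5]}
stem = {'objid': 2, 'inlinks': [1], 'outlinks': []}
print(merge_links([notehead, stem]))  # ([7], [5])
```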
-
muscima.cropobject.cropobjects_merge_mask(cropobjects, intersection=False)
Merges the given list of cropobjects into one. Masks are combined by an OR operation.
>>> c1 = CropObject(0, 'name', 10, 10, 4, 1, mask=numpy.ones((1, 4), dtype='uint8'))
>>> c2 = CropObject(1, 'name', 11, 10, 6, 1, mask=numpy.ones((1, 6), dtype='uint8'))
>>> c3 = CropObject(2, 'name', 9, 14, 2, 4, mask=numpy.ones((4, 2), dtype='uint8'))
>>> c = [c1, c2, c3]
>>> m1 = cropobjects_merge_mask(c)
>>> m1.shape
(4, 6)
>>> print(m1)
[[0 0 0 0 1 1]
 [1 1 1 1 1 1]
 [1 1 1 1 1 1]
 [0 0 0 0 1 1]]
Mask behavior: if at least one of the cropobjects has a mask, then masking behavior is activated. The masks are combined using OR: any pixel of the resulting merged cropobject that corresponds to a True mask pixel in one of the input cropobjects will get a True mask value, all others (i.e. including all intermediate areas) will get a False.
If no input cropobject has a mask, then the resulting cropobject also will not have a mask.
If some cropobjects have masks and some don’t, fails.
Parameters: intersection – Instead of a union, return the mask intersection: only those pixels which are common to all the cropobjects.
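The union case of the doctest above can be reproduced with a standalone sketch that uses nested lists instead of numpy arrays. The names merge_bbox and merge_masks are hypothetical; each object is given as a ((top, left, bottom, right), mask) pair, every object is assumed to have a mask, and the intersection option is not covered.

```python
def merge_bbox(bboxes):
    """Smallest (top, left, bottom, right) box enclosing all given boxes."""
    return (min(b[0] for b in bboxes), min(b[1] for b in bboxes),
            max(b[2] for b in bboxes), max(b[3] for b in bboxes))


def merge_masks(items):
    """items: list of (bbox, mask) pairs, mask being a list of 0/1 rows.
    Returns the OR of all masks on the canvas of the merged bounding box."""
    top, left, bottom, right = merge_bbox([bbox for bbox, _ in items])
    canvas = [[0] * (right - left) for _ in range(bottom - top)]
    for (t, l, _, _), mask in items:
        for i, row in enumerate(mask):
            for j, value in enumerate(row):
                if value:
                    canvas[t - top + i][l - left + j] = 1
    return canvas


# Same geometry as c1, c2, c3 in the doctest.
items = [
    ((10, 10, 11, 14), [[1, 1, 1, 1]]),
    ((11, 10, 12, 16), [[1, 1, 1, 1, 1, 1]]),
    ((9, 14, 13, 16), [[1, 1] for _ in range(4)]),
]
for row in merge_masks(items):
    print(row)
# [0, 0, 0, 0, 1, 1]
# [1, 1, 1, 1, 1, 1]
# [1, 1, 1, 1, 1, 1]
# [0, 0, 0, 0, 1, 1]
```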
-
muscima.cropobject.cropobjects_merge_multiple(cropobjects, clsname, objid)
Merge multiple cropobjects. Does not modify any of the inputs.
-
muscima.cropobject.cropobjects_on_canvas(cropobjects, margin=10)
Draws all the given CropObjects onto a zero background. The size of the canvas adapts to the CropObjects, with the given margin.
Also returns the top left corner coordinates w.r.t. CropObjects’ bboxes.
-
muscima.cropobject.link_cropobjects(fr, to, check_docname=True)
Add a relationship from the fr CropObject to the to CropObject. Modifies the CropObjects in-place.
If the objects are already linked, does nothing.
Parameters: check_docname – If set, checks for docname match and raises a ValueError if the CropObjects come from different documents.
-
muscima.cropobject.merge_cropobject_lists(*cropobject_lists)
Combines the CropObject lists from different documents into one list, so that inlink/outlink references still work. This is useful only if you want to merge two documents into one (e.g., if your annotators worked on different “layers” of data, and you want to merge these annotations).
This just means shifting the objid (and thus inlinks and outlinks). It is assumed the lists pertain to the same image. Uses deepcopy to avoid exposing the original lists to modification through the merged list.
Warning
If you are ever exporting the merged list, make sure to set the uid for the outputs correctly, if you want to create a new document.
Warning
Currently cannot handle precedence edges.
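The objid shifting can be illustrated with dictionaries standing in for CropObjects. The function name merge_lists and the offset rule used here (each list is shifted past the largest objid merged so far) are assumptions of this sketch; the library may choose its offsets differently.

```python
import copy


def merge_lists(*lists):
    """Each list holds dicts with 'objid', 'inlinks' and 'outlinks'.
    Returns one list in which objids and links are shifted consistently."""
    merged, offset = [], 0
    for objects in lists:
        shifted = copy.deepcopy(objects)  # never touch the caller's objects
        for o in shifted:
            o['objid'] += offset
            o['inlinks'] = [i + offset for i in o['inlinks']]
            o['outlinks'] = [j + offset for j in o['outlinks']]
        merged.extend(shifted)
        if merged:
            offset = max(o['objid'] for o in merged) + 1
    return merged


first = [{'objid': 0, 'inlinks': [], 'outlinks': [1]},
         {'objid': 1, 'inlinks': [0], 'outlinks': []}]
second = [{'objid': 0, 'inlinks': [], 'outlinks': [1]},
          {'objid': 1, 'inlinks': [0], 'outlinks': []}]
merged = merge_lists(first, second)
print([o['objid'] for o in merged])  # [0, 1, 2, 3]
print(merged[2]['outlinks'])         # [3]
```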
-
muscima.cropobject.split_cropobject_on_connected_components(c, next_objid)
Split the CropObject into one object per connected component of the mask. All inlinks/outlinks are retained in all the newly created CropObjects, and the old object is not changed. (If there is only one connected component, the object is returned unchanged in a list of length 1.)
An objid must be provided at which to start numbering the newly created CropObjects.
The data attribute is also retained.
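The splitting idea can be sketched with a simple flood fill over a list-of-rows mask. The names connected_components and split_ids are hypothetical, and 4-connectivity is an assumption: this page does not say which connectivity the library uses. The sketch only assigns new ids to pixel groups; building new CropObjects and copying links and data is not shown.

```python
def connected_components(mask):
    """Return the 4-connected components of a 0/1 mask (list of rows),
    each as a sorted list of (row, col) pixels, in scan order."""
    n_rows, n_cols = len(mask), len(mask[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    components = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c] or seen[r][c]:
                continue
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                i, j = stack.pop()
                pixels.append((i, j))
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and mask[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        stack.append((ni, nj))
            components.append(sorted(pixels))
    return components


def split_ids(mask, next_objid):
    """Pair each component with a fresh objid, numbering from next_objid."""
    return [(next_objid + k, pixels)
            for k, pixels in enumerate(connected_components(mask))]


mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]
print(split_ids(mask, 50))
# [(50, [(0, 0), (0, 1)]), (51, [(1, 3), (2, 3)])]
```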