Francisco Bischoff - 03 Mar 2020

Time Series with Matrix Profile

Overview

R functions implementing the UCR Matrix Profile algorithm (http://www.cs.ucr.edu/~eamonn/MatrixProfile.html).

This package allows you to use the Matrix Profile concept as a toolkit.
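
To make the concept concrete, here is a minimal brute-force sketch in base R. It is not how the package computes a Matrix Profile (STOMP and the other algorithms listed below are far faster); `naive_matrix_profile()`, `znorm()`, and the half-window exclusion zone are illustrative assumptions, not part of the tsmp API.

```r
# z-normalize a subsequence using the population standard deviation
znorm <- function(x) {
  s <- sqrt(mean((x - mean(x))^2))
  if (s == 0) return(rep(0, length(x)))
  (x - mean(x)) / s
}

# Brute-force self-join: for every subsequence, find the z-normalized
# Euclidean distance to its nearest neighbour outside the exclusion zone.
naive_matrix_profile <- function(x, window_size,
                                 exclusion = window_size %/% 2) {
  n_sub <- length(x) - window_size + 1
  # Columns of `subs` are the z-normalized subsequences
  subs <- sapply(seq_len(n_sub), function(i) {
    znorm(x[i:(i + window_size - 1)])
  })
  d <- as.matrix(dist(t(subs)))
  # Mask trivial matches (a subsequence and its close neighbours in time)
  idx <- seq_len(n_sub)
  d[abs(outer(idx, idx, "-")) <= exclusion] <- Inf
  list(
    mp = unname(apply(d, 1, min)),      # matrix profile (distances)
    pi = unname(apply(d, 1, which.min)) # profile index (nearest neighbour)
  )
}

# Plant the same pattern twice in random noise
set.seed(1)
x <- rnorm(200)
pattern <- sin(seq(0, 2 * pi, length.out = 20))
x[31:50] <- pattern
x[121:140] <- pattern

res <- naive_matrix_profile(x, window_size = 20)
res$pi[31] # 121: the two planted copies point at each other
```

The low points of `res$mp` mark repeated patterns (motifs), which is the information the motif search functions build on. This version needs memory quadratic in the series length, so it is only suitable for small examples.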

This package provides:

  • Algorithms to build a Matrix Profile: STAMP, STOMP, SCRIMP++, SIMPLE, MSTOMP and VALMOD.
  • Algorithms for MOTIF search for Unidimensional and Multidimensional Matrix Profiles.
  • Algorithm for Chains search for Unidimensional Matrix Profile.
  • Algorithms for Semantic Segmentation (FLUSS) and Weakly Labeled data (SDTS).
  • Algorithm for Salient Subsections detection allowing MDS plotting.
  • Basic plotting for all outputs generated here.
  • Sequential workflow, see below.

library(tsmp)
library(magrittr) # provides the %>% and %T>% pipes

# Basic workflow:
matrix <- tsmp(data, window_size = 30) %>%
  find_motif(n_motifs = 3) %T>%
  plot()

# SDTS still has its own workflow:
model <- sdts_train(data, labels, windows)
result <- sdts_predict(model, data, round(mean(windows)))

Please refer to the User Manual for more details.

Suggestions for improvement are welcome.

Performance on an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz using a random walk dataset

set.seed(2018)
data <- cumsum(sample(c(-1, 1), 40000, TRUE))

Current version benchmark

Function   Elapsed Time (s)  Data Size  Window Size  Threads  Lang
mpx_par                0.59      40000         1000        8  Rcpp
mpx                    1.94      40000         1000        1  Rcpp
stomp_par             38.90      40000         1000        8  R
stomp                 85.13      40000         1000        1  R
scrimp               123.07      40000         1000        1  R
stamp_par            925.45      40000         1000        8  R
stamp               3776.86      40000         1000        1  R

Currently available Features

  • STAMP (single and multi-thread versions)
  • STOMP (single and multi-thread versions)
  • STOMPi (On-line version)
  • SCRIMP (single-thread, not for AB-joins yet)
  • Time Series Chains
  • Multivariate STOMP (mSTOMP)
  • Multivariate MOTIF Search (from mSTOMP)
  • Salient Subsequences search for Multidimensional Space
  • Scalable Dictionary learning for Time Series (SDTS) prediction
  • FLUSS (Fast Low-cost Unipotent Semantic Segmentation)
  • FLOSS (Fast Low-cost On-line Unipotent Semantic Segmentation)
  • SiMPle-Fast (Fast Similarity Matrix Profile for Music Analysis and Exploration)
  • Annotation vectors (e.g., Stop-word MOTIF bias, Actionability bias)
  • FLUSS Arc Plot and SiMPle Arc Plot
  • Exact Detection of Variable Length Motifs (VALMOD)
  • MPdist: Matrix Profile Distance
  • Time Series Snippets
  • Subsetting Matrix Profiles (head(), tail(), [, etc.)
  • Misc:
    • MASS v2.0
    • MASS v3.0
    • MASS extensions: ADP (Approximate Distance Profile, with PAA)
    • MASS extensions: WQ (Weighted Query)
    • MASS extensions: QwG (Query with Gap)
    • Fast moving average
    • Fast moving SD
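
The MASS and moving-statistics items above can be illustrated with a small base R sketch. The helper names `moving_stats()` and `distance_profile()` are hypothetical and this is not the package's implementation; it only shows the general idea of cumulative-sum window statistics and FFT-based sliding dot products, assuming no window of the series is constant.

```r
# Mean and population SD of every window of length w, via cumulative sums
moving_stats <- function(x, w) {
  cs <- c(0, cumsum(x))
  cs2 <- c(0, cumsum(x^2))
  idx <- seq_len(length(x) - w + 1)
  mu <- (cs[idx + w] - cs[idx]) / w
  var <- (cs2[idx + w] - cs2[idx]) / w - mu^2
  list(mean = mu, sd = sqrt(pmax(var, 0))) # clamp rounding noise below zero
}

# z-normalized Euclidean distance from `query` to every window of `x`
# (assumption: neither the query nor any window is constant)
distance_profile <- function(query, x) {
  m <- length(query)
  n <- length(x)
  st <- moving_stats(x, m)
  # Sliding dot products: convolve x with the reversed, zero-padded query.
  # R's inverse FFT is unnormalized, hence the division by n.
  q_rev <- c(rev(query), rep(0, n - m))
  conv <- Re(fft(fft(x) * fft(q_rev), inverse = TRUE)) / n
  qt <- conv[m:n]
  mu_q <- mean(query)
  sd_q <- sqrt(mean((query - mu_q)^2))
  corr <- (qt - m * mu_q * st$mean) / (m * sd_q * st$sd)
  sqrt(pmax(2 * m * (1 - corr), 0))
}

# Query taken from the series itself: the best match is its own position
set.seed(42)
x <- cumsum(rnorm(300))
query <- x[101:125]
dp <- distance_profile(query, x)
which.min(dp) # 101
```

The cumulative-sum trick keeps the window statistics linear in the series length, and the FFT keeps the dot products at O(n log n), which is why a single distance profile is cheap compared with comparing the query against each window directly.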

Roadmap

  • Profile-Based Shapelet Discovery
  • GPU-STOMP

Matrix Profile Foundation

Our next step is unifying the Matrix Profile implementations across several programming languages.

Visit: Matrix Profile Foundation

Code of Conduct

Please note that the ‘tsmp’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.