
ParticleB Research Program

ParticleB

ParticleB aims to apply scientific research in AI and data analytics to products, using a problem-design mindset. We are currently focused on developing an automated trading platform tailored to optimizing risk-anchored portfolios. Natural language processing, exploratory data analysis, recommendation systems, and data storytelling are a few other domains in which we have ongoing projects.

Project Title: Information Extraction from Market Price Pattern Reports

In technical analysis, transitions between rising and falling trends are often signaled by price patterns. By definition, a price pattern is a recognizable configuration of price movement that is identified using a series of trend lines and/or curves. When a price pattern signals a change in trend direction, it is known as a reversal pattern; a continuation pattern occurs when the trend continues in its existing direction following a brief pause. Technical analysts have long used price patterns to examine current movements and forecast future market movements. Several domain experts monitor market behavior and publish their analyses of price patterns online. In this project, we aim to extract price patterns, along with their associated parameters, from such reports. The extracted patterns are then aggregated to produce market-prediction signals based on each pattern's definition.
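To make "a pattern alongside its associated parameters" concrete, one possible record for a single extracted pattern mention might look like the sketch below. The schema, field names, and example values are illustrative assumptions, not part of the project specification:

```python
from dataclasses import dataclass, field

@dataclass
class PatternReport:
    """One extracted price-pattern mention from a report (hypothetical schema)."""
    market: str     # e.g. "BTCUSDT" (illustrative)
    pattern: str    # e.g. "double_top"
    kind: str       # "reversal" or "continuation"
    timeframe: str  # e.g. "4h"
    sentiment: str  # "bullish" or "bearish"
    params: dict = field(default_factory=dict)  # pattern-specific parameters

# Example record: a bearish double top with a hypothetical neckline and target.
report = PatternReport(
    market="BTCUSDT", pattern="double_top", kind="reversal",
    timeframe="4h", sentiment="bearish",
    params={"neckline": 41200.0, "target": 39800.0},
)
```

A fixed record like this is what the later aggregation and backtesting phases would consume.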

Technical Phases
  1. Pattern review: describe each pattern using a set of parameters and inferred actions
  2. Data preparation: scrape technical reports from the available data sources
  3. Document classification: identify the price patterns discussed in each document
  4. Information extraction: extract predefined parameters, timeframe, sentiment, and predicted market behavior from the reports
  5. Aggregation: aggregate the extracted patterns for each market-timeframe pair and produce market predictions
  6. Backtesting: measure the correlation between predictions and historical prices
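The aggregation phase above can be sketched as a simple majority vote over the sentiments extracted for each market-timeframe pair. The tuple layout and voting rule below are illustrative assumptions; the real pipeline may weight sources differently:

```python
from collections import Counter

def aggregate_signals(reports):
    """Majority-vote aggregation of extracted sentiments.

    `reports` is a list of (market, timeframe, sentiment) tuples, one per
    extracted pattern mention. Returns one signal per (market, timeframe).
    """
    buckets = {}
    for market, timeframe, sentiment in reports:
        buckets.setdefault((market, timeframe), []).append(sentiment)
    # Counter.most_common(1) picks the majority label for each bucket.
    return {key: Counter(vals).most_common(1)[0][0]
            for key, vals in buckets.items()}

signals = aggregate_signals([
    ("BTCUSDT", "4h", "bearish"),
    ("BTCUSDT", "4h", "bearish"),
    ("BTCUSDT", "4h", "bullish"),
    ("ETHUSDT", "1d", "bullish"),
])
# signals[("BTCUSDT", "4h")] -> "bearish"; signals[("ETHUSDT", "1d")] -> "bullish"
```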
Data Sources
  • TradingView: the "Ideas" section
  • Telegram Channels (technical reports: Alireza has a list of viable channels)
  • We need to find more data sources
Schedule
Requirements
  • Eager to learn
  • Interested in machine learning theory and applications
  • Working knowledge of statistics and linear algebra
  • Experience with at least one machine learning or pattern recognition project (not a toy project)
  • Programming skills (preferably in Python or R)
  • Good knowledge of machine learning concepts (feature space, generalization, classification, clustering, …)
  • Experience with financial concepts and NLP is a plus
How to Apply

Send your resume to [email protected]. We will get in touch.

Previous internship 1398

During the summer and autumn of 1398, we hosted three young, joyful and highly talented interns. Each intern had a specific project defined based on our real needs. Our internship program was scheduled to cover these steps:

  • Literature review
  • Problem definition
  • Problem modeling
  • Initialization
  • Implementation
  • Evaluation
  • Integration
  • Publication

Having been assigned to a specific real project and working with other team members, our interns developed several skills:

  • How to carry out a research process and apply the results to a real project
  • Teamwork and communication skills
  • How to use version control software (git) for research and development
  • Bash script development
  • Parallel computing
  • How to use multiple GPUs and distributed training
  • Using Docker for service deployment

Zahra, Arad and Preni, our interns, did a fantastic job in learning, problem solving, dealing with issues and having fun! All of their projects were novel enough to be published at LREC 2020. The project titles and associated publications:

  • Irony Detection in Persian Language: A Transfer Learning Approach Using Emoji (publication)
  • Optimizing Annotation Effort Using Active Learning Strategies: A Sentiment Analysis Case Study in Persian (publication)
  • Twitter Trend Extraction: A Graph-based Approach for Tweet and Hashtag Ranking, Utilizing No-Hashtag Tweets (publication)
Previous internship 1399

During the summer and autumn of 1399, we hosted three young, joyful and highly talented interns, each assigned to a fintech-based project.

Investigating and Optimizing Hyper-Parameter Effects in Model Construction

As you know, any machine learning model is essentially a set of parameters and a formulation for using them. These parameters are estimated through a process called training. There are also parameters that are fixed before training, usually determined by the nature of the problem and its complexity (e.g. the number of hidden layers in an MLP). These parameters, aka hyper-parameters, are usually hard to set, and it takes experience and domain knowledge to get them right. Optimizing them is also challenging, since a complete training run is needed to find out whether a parameter set works. This project aims to propose a hyper-parameter optimization framework to overcome these issues.
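As a minimal illustration of the search problem such a framework addresses, random search is one of the simplest black-box strategies it could build on. The objective below is a toy stand-in for a full training-and-validation run; the search space and all names are illustrative assumptions:

```python
import random

def random_search(objective, space, n_trials=50, seed=0):
    """Random search over a hyper-parameter space.

    `space` maps each hyper-parameter name to a list of candidate values.
    Each trial calls `objective` once, mirroring the cost of one complete
    training run mentioned above. Returns the best configuration found.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy objective: pretend validation score peaks at 2 hidden layers, lr=0.01.
space = {"hidden_layers": [1, 2, 3, 4], "lr": [0.1, 0.01, 0.001]}
objective = lambda c: -abs(c["hidden_layers"] - 2) - abs(c["lr"] - 0.01)
best_cfg, best_score = random_search(objective, space, n_trials=100)
```

Smarter strategies (e.g. Bayesian optimization or early stopping of bad trials) reduce the number of full training runs, which is exactly the cost this project targets.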

Identification and Clustering of Significant Traders

Crypto-currency exchanges provide trader information, so it is possible to track traders' behaviour via an anonymous id. In this project we want to develop a framework for tracking and clustering traders. To do so, we first need to come up with a criterion for clustering sequential actions. The output of this project is expected to provide useful insights for our portfolio management and algorithm design teams.
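One possible starting point for a clustering criterion is to collapse each trader's action sequence into a fixed-length feature vector that a standard clustering algorithm (e.g. k-means) can consume. The features below are hypothetical, chosen only to sketch the idea:

```python
def trader_features(trades):
    """Summarize one trader's sequential actions into a fixed-length vector.

    `trades` is a chronological list of (timestamp, side, volume) tuples,
    with side in {"buy", "sell"}. The three features (trade frequency,
    average volume, buy ratio) are illustrative assumptions.
    """
    n = len(trades)
    volumes = [volume for _, _, volume in trades]
    buys = sum(1 for _, side, _ in trades if side == "buy")
    span = trades[-1][0] - trades[0][0] if n > 1 else 1.0
    return [n / span,          # trade frequency (trades per time unit)
            sum(volumes) / n,  # average trade volume
            buys / n]          # fraction of buy-side actions

feats = trader_features([(0, "buy", 1.0), (10, "sell", 3.0), (20, "buy", 2.0)])
# feats -> [0.15, 2.0, 0.666...]
```

Vectors like this, one per anonymous id, could then be fed to any off-the-shelf clustering method; richer sequence-aware criteria would be the project's real contribution.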

Feature Space Transformation Preserving Time-Dependent Information

Any classic machine learning pipeline contains a feature engineering and extraction phase. This step is critical for getting the best out of the model, since it defines the feature space and its representational power. Dimensionality reduction is one way to decrease the complexity of the feature space while preserving its information. In this project we are going to build a dimensionality reduction algorithm specifically for time series data that both reduces feature-space complexity and preserves short- and long-term dependencies.
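As one hedged sketch of the idea (not the algorithm the project will build), the snippet below applies PCA to overlapping lag windows of a series, so that short-term temporal structure is embedded in the feature space before it is reduced; numpy is assumed:

```python
import numpy as np

def windowed_pca(series, window=8, k=2):
    """Time-series dimension reduction sketch: embed the series into
    overlapping lag windows (so short-term dependencies live inside each
    window), then project onto the top-k principal components.

    Returns the (n_windows, k) reduced representation.
    """
    X = np.lib.stride_tricks.sliding_window_view(series, window).astype(float)
    Xc = X - X.mean(axis=0)
    # SVD of the centred window matrix yields the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

t = np.linspace(0, 4 * np.pi, 200)
reduced = windowed_pca(np.sin(t) + 0.1 * np.cos(5 * t), window=16, k=2)
# 200 samples with window 16 -> 185 windows, each reduced to 2 components.
```

Plain PCA on lag windows captures short-range structure but not long-range dependencies; closing that gap is what the project aims to address.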