About Me

I am regarded as a statistical expert with a focus on time series, smoothing, and big data

Introduction

Mahdi Imani

Statistician • Programmer



Language Skills

Persian: Native
English: Fluent (TOEFL iBT, GRE General)

What I do

Do you have extra work that a freelance statistician could help you with?

Are you looking to hire a freelancer that’s proficient in theoretical or applied statistics?

Are you overloaded with information and need a statistician to make sense of it for you?

Are you an online entrepreneur? Then you surely realize the importance of hiring an efficient web programmer to build your online storefront. Even the most ambitious and intelligent business owner will rarely be able to create a website that functions well and looks professional enough to ensure success.

Look no further for an expert statistician or programmer. I’ll quickly understand your needs and deliver a powerful, intuitive solution. Stuck on a frustrating problem? I’ll fix it. Just read my reviews!

Statistician

I apply my knowledge of statistical methods to a variety of subject areas, such as biology, economics, engineering, medicine, public health, psychology, marketing, education, and sports. Many economic, social, political, and military decisions cannot be made without statistical techniques, such as the design of experiments to gain Federal approval of a newly manufactured drug. Statistics might be needed to show whether the seemingly good results of a drug were likely because of the drug rather than just the effect of random variation in patient outcomes.

Programmer

As a website programmer, I offer a comprehensive list of services to meet all your web design and development needs: domain booking and registration, website hosting, custom design of HTML or Flash sites, multimedia presentations, portals and vortals, and maintenance and backend services. I can provide virtually anything you can imagine. I draw upon my creative resources and use some of the most prominent design software in the world, including Macromedia Dreamweaver, Flash, CorelDraw, and PhotoShop.

Database Administrator

As a database administrator (DBA), I am responsible for the installation, configuration, upgrading, administration, monitoring, maintenance, and security of databases in an organization. My role includes developing and designing database strategies, monitoring systems, improving database performance and capacity, and planning for future expansion requirements. I can also plan, coordinate, and implement security measures to safeguard the database.

Statistical Skills

Statistical Analysis
Questionnaire Design
SAS Program
R Program
SPSS Program

Programming Skills

HTML5
PHP
jQuery & Javascript
CSS3
MySQL & PgSQL

Education & Work


Education

Master of Science
Tehran University of Medical Sciences
2009-2012

Tehran, Iran
Major emphasis: Biostatistics
Thesis Title: "Comparison of the Accuracy of Smoothing Methods for Forecasting Time Series of Work Related Mortality Rates for Registered insured in SSIO Between 2000 and 2011"
Supervisor: Professor Masoud Salehi

Bachelor of Science
University of Tabriz
2003-2008

Tabriz, Iran
Major emphasis: Statistics

Research Focus: Data Analysis
Analysis and inference of data using the statistical programs SAS, SPSS, Minitab, and R. Data mining and questionnaire design to gather observations in an expert way. Analysis and forecasting of time series of manufacturing data to predict the number of production lines required.

Pre-University
Shahid Rajaei Pre-Uni. Center
2000-2001

Tehran, Iran
Major emphasis: Mathematical Physics






High School
Shahid Motahari High School
1997-2000

Tehran, Iran
Major emphasis: Mathematical Physics






Work Experience

Market Developer & Researcher
Statistical Predictor
2014-Present

Tehran, Iran
Market R&D at Sar-E-Nakh Textiles Co.
As a statistician with experience in time series analysis, I work as a market researcher and developer in the company. My main duty is to gather all useful market data in order to assess and forecast the direction of the company’s future activities, with short- or long-term checkpoints.

Freelancer
Statistician and Programmer
2013-2015

Tehran, Iran
Freelance work in statistics, web programming, big data database design, and data mining.

Statistician
Small Projects
2006-2012

Tehran, Iran
Participated as a statistician in research teams on small statistical projects, and worked as a statistical analyst and an advisor for statistics students.

M.Sc. Thesis Abstract

My thesis, entitled “Comparison of the Accuracy of Smoothing Methods for Forecasting Time Series of Work Related Mortality Rates for Registered insured in SSIO Between 2000 and 2011”, taught me much about time series and smoothing methods. The objective of the study was to model, estimate, and predict the time series of the death rate and the total number of occupational accidents. In this project, I compared two nonparametric regression methods, kernel and spline, to find the best interpolation and estimation of the series’ missing values, and two prediction methods, exponential smoothing and Box-Jenkins, to find the best prediction of the series. The comparison was based on the sum of absolute errors.

Abstract
General Concepts and Purpose: Controlling the occurrence of workplace accidents is a subject of interest in all countries worldwide. The financial consequences of these accidents and the economic losses imposed on the companies involved are only one minor aspect of the damage; when the non-economic, intangible losses to society are taken into consideration, such as the loss of human lives and its impact on the lives of survivors, the economic damage becomes marginal. The purpose of this study is to fit the best possible model to the time series of the death rate and the total number of workplace accidents by comparing smoothing methods, in order to predict these series and to estimate their missing values while fitting the models. There are few methods for estimating missing data in seasonal time series. This research tries to calculate the best estimates in order to make the best prediction of the series.
Materials and Methods: This study models, estimates, and predicts the time series of the death rate and the total number of occupational accidents for people insured by the Iranian Social Security Organization between 2000 and 2010. Two nonparametric regression methods, kernel and spline, were compared to find the best interpolation and estimation of the series’ missing values, and two prediction methods, exponential smoothing and Box-Jenkins, were compared to find the best prediction of the series, using the sum of absolute errors.
Results: For the workplace death rate time series, kernel smoothing is recommended for interpolating the points and exponential smoothing for predicting the series. On the comparison criterion, the Mean Square Forecast Error was 8.91 for the exponential smoothing method and 8.85 for the Box-Jenkins method. For the time series of the total number of workplace accidents, the spline method for interpolation and Box-Jenkins analysis for prediction showed better performance. The Mean Square Forecast Error was 218858 for the exponential smoothing method and 25712 for the Box-Jenkins method.
Conclusion: When a time series has a simple trend and lacks seasonal changes, both interpolation methods and both prediction methods perform approximately identically. Thus, the methods with simpler and proportionally less costly calculations seem suitable for such situations. The kernel and exponential smoothing methods involve far fewer calculations and require fewer assumptions than the other two methods, and their use is therefore more appropriate in such conditions. In more complex series, where there is a nonlinear trend with seasonal changes, the spline method and Box-Jenkins analysis perform better. These two methods involve costly calculations and require more assumptions to be checked before fitting than the other two methods.

Key words:
Occupational Accidents, Time Series, Smoothing Methods, Estimation of Missing Observations
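
As a rough illustration of the comparison criterion used in the abstract, the sketch below computes one-step-ahead forecasts with simple exponential smoothing and scores them by Mean Square Forecast Error. The function names, the smoothing constant, and the yearly rates are all hypothetical and are not taken from the thesis. A naive "last value" forecast stands in for the second method, since a Box-Jenkins fit is too long to show here.

```python
def exponential_smoothing_forecasts(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    forecasts[t] is the forecast of series[t] made from series[:t];
    the first forecast is initialised to the first observation.
    """
    level = series[0]
    forecasts = [level]
    for value in series[:-1]:
        # Blend the newest observation with the previous level.
        level = alpha * value + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def mean_square_forecast_error(actual, forecasts):
    """Average squared one-step-ahead forecast error."""
    errors = [(a - f) ** 2 for a, f in zip(actual, forecasts)]
    return sum(errors) / len(errors)


# Hypothetical yearly rates, not data from the thesis.
rates = [10.0, 12.0, 11.0, 13.0]
smoothed = exponential_smoothing_forecasts(rates, alpha=0.5)
naive = [rates[0]] + rates[:-1]  # "last value" benchmark forecast

msfe_smoothing = mean_square_forecast_error(rates, smoothed)
msfe_naive = mean_square_forecast_error(rates, naive)
print(msfe_smoothing, msfe_naive)
```

The method with the smaller error would be preferred on this criterion. When the two errors are close, the simpler method may still be the better choice, as the conclusion above argues.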

Research & Publication


Publication

15 Oct 2015
Work-related accidents among the Iranian population: a time series analysis, 2000–2011.

M. Karimlou, M. Salehi, M. Imani, et al.

International Journal of Occupational and Environmental Health
Oct 2015. 21(4): p. 279-284.

18 Dec 2012
Bayesian modeling of work-related accidents in Iran: 2009.

M. Salehi, M. Imani, et al.

Journal of Health Administration
2012. 51(16): p. 30-42. [In Persian]

15 Apr 2012
Forecasting number of work-related injuries time series with Box-Jenkins Models for registered insured in SSIO between 2000 and 2010 in Iran.

M. Imani, M. Salehi, et al.

Razi Journal of Medical Sciences
2012. 19(100): p. 12-21. [In Persian]

1 Feb 2012
Comparison of the Accuracy of Smoothing Methods for Forecasting Time Series of Work Related Mortality Rates for Registered insured in SSIO Between 2000 and 2011.

M. Imani - Supervisor: M. Salehi

M.Sc. Thesis, Tehran University of Medical Sciences
2012. [In Persian]

Current Research

Interpolating Missing Values Using Spline Smoothing: A Big Data Approach in Multivariate Time series

Almost every area needs to collect time related information, resulting in time series consisting of observations collected at regular time intervals. Technological developments have meant that it is now easier than ever before to collect large quantities of time series data in many new areas. Consider for example the large influx of data arriving every minute to databases collected by wearable technology. This is just one of many examples where huge amounts of data are being collected and stored in accessible databases. There is a simple rule about data: When the quantity of data increases, quality will decrease, often resulting in large quantities of missing data. Especially when the data gathering is being performed by untrained people or machines, missing data is a concern. Big Missing Data is therefore a new problem that statisticians need to address seriously.

Missing data in time series usually manifests as a large number of consecutive observations going missing, due to machine failure or similar causes. The goal of this research is to impute the missing data in a time series by considering other time series that show some correlation with it. In big time series databases there will be numerous correlated series that can provide a priori information for imputing the missing values in other series. Such correlated series are identified by clustering and classification methods that use optimal similarity measures in combination with characteristic-based clustering algorithms. Various spline smoothing methods are then used to impute different types of missing data for the correlated series. As the last step, the fitted spline curves are fine-tuned using methods such as anomaly detection and motif discovery to find the best estimates of the missing data.

Two new research approaches will be used in this research: big data in correlated time series and multivariate spline smoothing methods for imputing missing values. These approaches will be applied with a range of simulated and real databases, allowing thorough testing in a variety of statistical contexts and resulting in the publication of several research articles. The research will also produce a statistical software package that will be made available to the R community.

Research Interests

One of my minor research areas is survival analysis. It is about the time duration until one or more events happen, such as death in biological organisms and failure in mechanical systems. Survival analysis attempts to answer questions such as: what is the proportion of a population which will survive past a certain time? Of those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken into account? How do particular circumstances or characteristics increase or decrease the probability of survival?

The most attractive aspect of this area for me is estimating parameters using nonparametric methods.
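
One standard nonparametric answer to the first question above (what proportion survives past a certain time) is the Kaplan-Meier estimator. The sketch below is a minimal version under simple assumptions; the function name and the follow-up times are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations: time until the event or until censoring.
    observed:  True if the event was seen, False if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for time in sorted(set(durations)):
        events = sum(1 for d, o in zip(durations, observed) if d == time and o)
        leaving = sum(1 for d in durations if d == time)
        if events:
            # Multiply by the share of the risk set that survives this time.
            survival *= 1 - events / at_risk
            curve.append((time, survival))
        at_risk -= leaving  # events and censored subjects leave the risk set
    return curve


# Hypothetical follow-up times; False marks a censored subject.
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, True]
curve = kaplan_meier(durations, observed)
print(curve)
```

The censored subject at time 2 contributes no event, but it still shrinks the risk set for later times, which is how the estimator uses incomplete follow-up without assuming a parametric lifetime distribution.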

Clinical trials are one of the major research areas I am interested in. Clinical trials are experiments done in clinical research. Such prospective biomedical or behavioral research studies on human participants are designed to answer specific questions about biomedical or behavioral interventions, including new treatments (such as novel vaccines, drugs, dietary choices, dietary supplements, and medical devices) and known interventions that warrant further study and comparison. Clinical trials generate data on safety and efficacy.

Clinical trials need a very innovative mind to design and perform experiments cleverly, since most of the time they deal with critical investigations involving drugs and patients and therefore need a very accurate and sharp-sighted approach.

Playing with factors and their effects on a particular outcome is one of my hobbies, and clinical research is a precise answer to this fascination.

The world is becoming more and more instrumented, interconnected and intelligent, resulting in mountains of newly generated data. With storage costs coming down significantly, companies now want to leverage this instrument-generated data (including meter, temperature and all types of sensor data over time) for conducting analysis. Among all the types of big data, data from sensors is the most widespread and is referred to as time series data.

Time series with a big data approach is my main area of research. When the data is big, we need to optimize two functions from two different scientific areas: computer science and statistics. In fact, we need to strike a balance between the amount of I/O processing and statistical accuracy. This is the most interesting issue I am working on.

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. Clustering methods have become an area of growing interest in recent years. There are over 100 published clustering algorithms, which can be categorized based on their cluster model.

Among the numerous clustering methods, I want to study those that deal with huge datasets. In fact, I am working on clustering methods for data that is too big for simple methods, where, given the amount of I/O processing, we have to optimize the accuracy of clustering and the use of computer resources simultaneously.
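
One simple way to keep I/O low is to read each observation exactly once. The sketch below shows a sequential (online) variant of k-means on one-dimensional data: it trades some accuracy for a single pass over the stream. It is an illustration of the trade-off only, not the method under study, and the function name, the readings, and the initial centers are hypothetical.

```python
def sequential_kmeans(stream, initial_centers):
    """Cluster a stream of numbers in a single pass.

    Each value is read once, assigned to its nearest center, and that
    center is moved to the running mean of the values assigned to it.
    """
    centers = list(initial_centers)
    counts = [0] * len(centers)
    for value in stream:
        nearest = min(range(len(centers)), key=lambda k: abs(value - centers[k]))
        counts[nearest] += 1
        # Incremental mean update: no past values need to be stored.
        centers[nearest] += (value - centers[nearest]) / counts[nearest]
    return centers, counts


# Hypothetical readings; a generator stands in for data read from disk.
readings = (x for x in [1.0, 9.0, 2.0, 10.0, 3.0, 11.0])
centers, counts = sequential_kmeans(readings, initial_centers=[0.0, 8.0])
print(centers, counts)
```

Because nothing is revisited, memory use stays constant in the length of the stream, but the result depends on the initial centers and on the order of arrival, which is the accuracy side of the balance.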

Forecasting time series was my earlier research interest, particularly while I was working on my MSc thesis. What I am interested in now is the control of time series. Methods for controlling time series differ according to the area in which they are used. The area that attracts me is when the data is so big that we need to strike a balance between the loss function and the amount of I/O processing.

Database administration is one of the most attractive subjects for me, especially when it is combined with statistical fields. My recent research is on time series analysis. It is therefore natural that my study lies between database administration and time series.

Because it is time-stamped, time series data has a special internal structure that differs from relational data, and this matters most when the data is big. Additionally, many applications such as smart metering store data at frequent intervals that require massive storage capacity. For these reasons, it is not sufficient to manage time series information using the traditional relational approach of storing one row for each time series entry. Doing so creates performance challenges as the data volumes grow exponentially. One solution is Informix TimeSeries.

There are plenty of algorithms for estimating missing values in a dataset. When the data is in the form of time series, the pattern of existing values matters more for interpolating missing values than in other forms of data. Typically, each series has about 15 characteristics, including trend and seasonal properties, which should be considered when interpolating missing values. In addition, when there is more than one series and the series are correlated, interpolation methods (e.g. the EM algorithm) are less effective because they ignore the correlation. My focus is on estimating missing values in such datasets using other nonparametric methods, with optimal interpolation that takes the series characteristics into account.

Multivariate nonparametric methods are one of the most interesting fields to me. Unfortunately, I have not had the chance to work on the subject yet, but I hope to enter this area soon.

Smoothing is a very broad scientific area, ranging from time series analysis to image processing algorithms. I have had the chance to work on kernel and spline smoothing methods for time-stamped data. Now I am ready to extend the use of spline methods to more complicated data.
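
As a small example of kernel smoothing on time-stamped data, the sketch below fills a missing observation with a Nadaraya-Watson estimate using a Gaussian kernel. It assumes equally spaced time points indexed by list position. The function names, the bandwidth, and the series are hypothetical.

```python
import math


def kernel_smooth(times, values, at, bandwidth):
    """Nadaraya-Watson estimate at time `at` with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((at - t) / bandwidth) ** 2) for t in times]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def fill_missing(series, bandwidth=1.0):
    """Replace None entries with kernel-smoothed estimates.

    Time is taken to be the list index; observed values are kept as-is.
    """
    times = [i for i, v in enumerate(series) if v is not None]
    values = [series[i] for i in times]
    return [
        v if v is not None else kernel_smooth(times, values, i, bandwidth)
        for i, v in enumerate(series)
    ]


# Hypothetical series with one missing observation.
series = [1.0, 2.0, None, 4.0, 5.0]
filled = fill_missing(series)
print(filled)
```

The bandwidth controls how far the estimate looks for information: a small value leans on the nearest neighbours, while a large value pulls the estimate toward the overall mean of the series.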

References

“During his MSc program, I played different roles: supervisor of his master’s dissertation and his teacher in courses such as Multivariate Data Analysis and Categorical Data Analysis. Given his history of study and research, I am confident that he will bring vision to research problems, the technical skills to solve problems and prepare evaluations, and enthusiastic hard work in a highly diverse environment. The multidimensionality of his knowledge (mathematics, computing, and statistics) and his smart vision for solving problems, accompanied by his hard work, are Mahdi’s greatest academic strengths.”

Masoud Salehi

Assistant Professor of Biostatistics

“Mahdi has excellent skills critical to a successful career in research, such as tenacity, excellent problem solving, and critical thinking. These skills have helped him to collaborate with other researchers resulting in publishing of papers in peer-reviewed journals. Mahdi is a leader and beyond his passion on driving performance, he is consciously proactive at getting full involvement of all other team members to derive the best results possible. Mahdi has just the right combination of assertiveness and respect to make him a joy to work with, and I know that he got along well with everyone and his colleagues commented favorably about working with him.”

MahmoudReza Gohari

Associate Professor of Biostatistics

“Mahdi was an undergraduate student in the statistics program at University of Tabriz. He is highly intelligent, works well as a team member, and has demonstrated leadership potential. His programming skills are awesome, and combined with his statistical knowledge, they serve data science well. He has the intellectual capacity and he has the ambition. Based on what I have seen of him in the classroom, he also has the drive.”

Hossein Jabbari Khamnei

Associate Professor of Statistics

Blog


Big Data in Healthcare - Part 1

Posted By Mahdi Imani, 0 comments

This is Part 1 of a six part series that discusses how to leverage big data in healthcare. This educational video discusses the benefits of analyzing clinical and business outcomes to improve patient care.

02 April 2016
Big Data for Health

Posted By Mahdi Imani, 0 comments

This paper provides an overview of recent developments in big data in the context of biomedical and health informatics. It outlines the key characteristics of big data and how medical and health informatics, translational bioinformatics, sensor informatics, and imaging informatics will benefit from an integrated approach of piecing together different aspects of personalized information from a diverse range of data sources, both structured and unstructured, covering genomics, proteomics, metabolomics, as well as imaging, clinical diagnosis, and long-term continuous physiological sensing of an individual.

02 April 2016
Data Stream Clustering by Divide and Conquer Approach Based on Vector Model

Posted By Mahdi Imani, 0 comments

Recently, many researchers have focused on data stream processing as an efficient method for extracting knowledge from big data. Data stream clustering is an unsupervised approach that is employed for huge data. The continuous effort on data stream clustering method has one common goal which is to achieve an accurate clustering algorithm.

02 April 2016
A Data Mining Framework to Analyze Road Accident Data

Posted By Mahdi Imani, 0 comments

One of the key objectives in accident data analysis is to identify the main factors associated with road and traffic accidents. However, the heterogeneous nature of road accident data makes the analysis task difficult. Data segmentation has been used widely to overcome this heterogeneity of the accident data.

03 December 2015
Work-related Accidents Among the Iranian Population: A Time Series Analysis, 2000-2011

Posted By Mahdi Imani, 0 comments

Background: Work-related accidents result in human suffering and economic losses and are considered as a major health problem worldwide, especially in the economically developing world.

28 November 2015
Bayesian Model for Work-related Accidents in Iran: 2009

Posted By Mahdi Imani, 0 comments

Introduction: It is of prime importance to consider the pattern and geographical changes of a disease, in each community independently, to determine high and low risk areas. Mapping diseases is a set of statistical methods which attempt to provide precise maps by which the geographical distribution of a disease is estimated. In this study, Bayesian methods were applied to estimate the relative death rate of work-related accidents in Iran.

27 November 2015
Forecasting Number of Work-related Injuries Time Series with Box-Jenkins Models for Registered Insured in SSIO Between 2000 and 2010 in Iran

Posted By Mahdi Imani, 0 comments

Background: Controlling the occurrence of workplace accidents is a subject of interest in all countries worldwide. The financial consequences of these accidents and the economic losses imposed on the companies involved are only one minor aspect of the damage; when the non-economic, intangible losses to society are taken into consideration, the economic damage becomes marginal. The purpose of this research is to fit the Box-Jenkins model to the time series of the total number of workplace accidents and to estimate the series' missing values while fitting the model.

27 November 2015
Everything must be made as simple as possible. But not simpler.

Albert Einstein

19 November 2015
Quantum Theory Reveals Puzzling Pattern in How People Respond to Some Surveys

Posted By Mahdi Imani, 0 comments

Researchers used quantum theory - usually invoked to describe the actions of subatomic particles - to identify an unexpected and strange pattern in how people respond to survey questions.

19 November 2015
A Simple Solution for Big Data: New Algorithm Simplifies the Categorization of Data

Posted By Mahdi Imani, 0 comments

Experts use the expression big data to indicate huge amounts of information, such as those (photos, videos, texts, but also other more technical types of data) shared at any time by billions of people on computers, smartphones and other electronic devices. The present-day scenario offers unprecedented perspectives: tracking flu epidemics, monitoring road traffic in real time, or handling the emergency of natural disasters, for example. For us to be able to use these huge amounts of data, we have to understand them and before that we need to categorize them in an effective, fast and automatic manner.

18 November 2015

Contact


Contact Information

Email:

Call: (+98) 912 345 3358

Location: Tehran, Iran

Website: www.imani.pro
