We often use ssh (Secure Shell) to connect to a server remotely from a terminal. If you are tired of entering a password every time you log in to the server, follow these steps to set up passwordless login.
Read more...
Last year, someone broke the window of our car while it was parked in a parking lot and took some stuff. I thought it would be a good idea to look at the robbery data for Montreal, where we have been living.
Read more...
If you need to create variable names from the loop variable, use exec:
for i in range(4):
    exec(f'var_{i} = [range(i)]')
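When the generated names are only needed as lookup keys, a dictionary can often replace exec. The sketch below is an alternative, not the approach shown above; the name `variables` is illustrative, and each value is stored as a plain list.

```python
# Store each generated value under a computed key instead of
# creating real variable names with exec.
variables = {}
for i in range(4):
    variables[f'var_{i}'] = list(range(i))

print(variables['var_3'])
```

The values are then read back with `variables['var_2']` rather than `var_2`, which keeps the namespace clean and makes the generated names easy to loop over.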
import dill
# save the current interpreter session to a file
dill.dump_session('backup_2021_10_22.db')
# restore it later; the saved variables are loaded back into the session
dill.load_session('backup_2021_10_22.db')
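import numpy as np
# rows of different lengths need dtype=object in recent NumPy versions
x = np.array([[5,2,1,3], [2,1,5]], dtype=object)
fun = lambda t: np.argmax(t)
# index of the largest element in each row
np.array([fun(xi) for xi in x])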
To show how to fit a multiple regression model using R and Python, we consider the car data [car], which contains car specifications such as HwyMPG, Model, etc. We fit different regression models to predict highway MPG from the car specifications.
Read more...
Logistic regression is used when the response variable is binary, such as (0 or 1), (live or die), or (fail or succeed).
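As a rough illustration of a binary response, the sketch below fits a one-predictor logistic model by gradient descent in plain Python. The data are made up, and the function names, learning rate, and step count are assumptions chosen for this example, not taken from the post.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # gradient of the mean log-loss with respect to b0 and b1
        g0 = sum(sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)) / n
        g1 = sum((sigmoid(b0 + b1 * x) - y) * x for x, y in zip(xs, ys)) / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1

# Made-up data: hours studied and the outcome, fail (0) or succeed (1)
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
p_low = sigmoid(b0 + b1 * 1.0)    # estimated probability of success after 1 hour
p_high = sigmoid(b0 + b1 * 4.0)   # estimated probability of success after 4 hours
```

The fitted model returns a probability between 0 and 1, which is then thresholded (for example at 0.5) to predict the binary outcome.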
Read more...
TensorFlow is a library developed for building machine learning models. It is built around manipulating tensors, which are multidimensional arrays. It supports hardware acceleration (GPU), which makes it suitable for machine learning models that need a lot of computation.
Read more...
In this short post, we show how to run several classification methods: XGBoost, AdaBoost, discriminant analysis, KNN, random forest, decision theory, Gaussian process, logistic regression, Gaussian mixture classification, SVM, and LSTM.
Read more...
When we do not have a closed-form solution, we can use optimization to estimate the parameters. To do so, you need a loss function and a procedure that minimizes the value of the loss function over the parameter values.
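A minimal sketch of this idea, assuming a toy model y = exp(theta * x) with a squared-error loss and plain gradient descent. The model, the data, the step size, and the function names are all illustrative choices, not the method used in the post.

```python
import math

def loss(theta, xs, ys):
    """Mean squared error of the model y = exp(theta * x)."""
    return sum((y - math.exp(theta * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def minimize(f, theta0, lr=0.01, steps=2000, h=1e-6):
    """Gradient descent using a central-difference estimate of the derivative."""
    theta = theta0
    for _ in range(steps):
        grad = (f(theta + h) - f(theta - h)) / (2 * h)
        theta -= lr * grad
    return theta

# Noise-free toy data generated with theta = 0.7
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * x) for x in xs]

theta_hat = minimize(lambda t: loss(t, xs, ys), theta0=0.0)
print(round(theta_hat, 3))
```

The step size matters here: too large a value of `lr` can make the iterations diverge, so in practice a library optimizer with a line search is usually a safer choice.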
Read more...
Python is equipped with strong tools for repeating commands and for producing sequences of numbers.
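A few of these tools, shown on small illustrative values:

```python
# range produces a sequence of integers; the stop value is excluded
evens = list(range(0, 10, 2))               # [0, 2, 4, 6, 8]

# a for loop repeats a block for every element
total = 0
for value in evens:
    total += value                          # total ends at 20

# a list comprehension builds a new list in one expression
squares = [v ** 2 for v in evens]           # [0, 4, 16, 36, 64]

# enumerate yields (index, element) pairs
indexed = list(enumerate(['a', 'b', 'c']))  # [(0, 'a'), (1, 'b'), (2, 'c')]

# a while loop repeats until its condition becomes false
n, halvings = 40, 0
while n > 1:
    n //= 2
    halvings += 1                           # 40 -> 20 -> 10 -> 5 -> 2 -> 1
```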
Read more...
In the context of programming, a function is a sequence of statements that performs a computation. A function has three parts: arguments, body, and output. Python has two kinds of functions: built-in functions, which are part of the core of Python or collected in packages, and user-defined functions, which are written by the user.
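The two kinds of functions can be illustrated as follows; the `mean` function and its `default` argument are hypothetical names used only for this example.

```python
import math

# built-in function from the core of Python
length = len([3, 1, 2])            # 3

# function collected in a package (here the standard math module)
root = math.sqrt(16)               # 4.0

# user-defined function: arguments, body, and returned output
def mean(values, default=0.0):
    """Return the arithmetic mean of values, or default for an empty input."""
    if not values:
        return default
    return sum(values) / len(values)

avg = mean([2, 4, 9])              # 5.0
empty = mean([])                   # 0.0
```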
Read more...
Python provides a variety of useful data structures, such as lists, sets, and dictionaries, as well as new structures defined by the programmer, which are called classes.
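A short sketch of each structure; the `Point` class and the sample values are illustrative.

```python
# list: ordered and mutable
scores = [3, 1, 2]
scores.append(5)                   # [3, 1, 2, 5]

# set: unordered, duplicates are dropped
unique = set([1, 2, 2, 3])         # {1, 2, 3}

# dictionary: maps keys to values
ages = {'Ann': 31, 'Bob': 27}
ages['Cid'] = 45

# class: a structure defined by the programmer
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm(self):
        """Euclidean distance from the origin."""
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
dist = p.norm()                    # 5.0
```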
Read more...
To see the types, a summary, and general information about the variables in a data-frame, use .dtypes, .describe(), and .info().
A pipeline in pandas allows you to build a sequence of functions that run in order on a data-frame.
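One way to build such a sequence is the `.pipe` method of a data-frame. The sketch below assumes two hypothetical steps, `drop_missing` and `add_ratio`, and made-up column names.

```python
import pandas as pd

def drop_missing(df):
    """Step 1: remove rows that contain missing values."""
    return df.dropna()

def add_ratio(df, num, den, name):
    """Step 2: add a column holding the ratio of two existing columns."""
    out = df.copy()
    out[name] = out[num] / out[den]
    return out

df = pd.DataFrame({'rooms': [6.0, 8.0, None, 4.0], 'people': [2, 4, 3, 1]})

# each step receives the data-frame returned by the previous one
result = (df.pipe(drop_missing)
            .pipe(add_ratio, num='rooms', den='people', name='rooms_per_person'))
```

Writing each step as a function that takes and returns a data-frame keeps the steps reusable and makes the order of operations easy to read.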
Read more...
A column can easily be added to a data-frame:
df0=pd.DataFrame([38,40,25,33])
df['Ave_hour']=df0
Pandas is very useful for merging datasets. To merge data, consider the following datasets, where ‘id1’ and ‘id2’ contain the ids of the data.
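A minimal sketch of a merge on differently named id columns. The two small data-frames are made up; only the key names ‘id1’ and ‘id2’ come from the text.

```python
import pandas as pd

left = pd.DataFrame({'id1': [1, 2, 3], 'score': [90, 80, 70]})
right = pd.DataFrame({'id2': [2, 3, 4], 'group': ['a', 'b', 'c']})

# inner merge: keep only the ids present in both datasets (2 and 3)
inner = pd.merge(left, right, left_on='id1', right_on='id2', how='inner')

# outer merge: keep every id from either dataset, filling gaps with NaN
outer = pd.merge(left, right, left_on='id1', right_on='id2', how='outer')
```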
Read more...
To select data, use the name of the variable, or specify the indices via .iloc and .loc (see http://pandas.pydata.org/pandas-docs/version/0.22/indexing.html). .iloc is integer-based selection and should be used with integer indices. In contrast, .loc is primarily label-based, and may also be used with a boolean array.
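The difference can be seen on a small made-up data-frame with string row labels:

```python
import pandas as pd

df = pd.DataFrame({'age': [38, 40, 25, 33], 'city': ['A', 'B', 'A', 'C']},
                  index=['w', 'x', 'y', 'z'])

col = df['age']                      # select a column by variable name
first_two = df.iloc[0:2]             # integer positions 0 and 1 (end excluded)
cell = df.iloc[2, 0]                 # row position 2, column position 0
by_label = df.loc['x':'z', 'age']    # label slice, both ends included
over_30 = df.loc[df['age'] > 30]     # boolean array selects matching rows
```

Note that a slice with .iloc excludes its end position, while a label slice with .loc includes its end label.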
A pandas data-frame is a very useful format for working with datasets; it is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). The following code creates a data-frame from a dictionary.
Read more...
To show how to generate a cross tabulation, let us categorize the columns: consider two continuous variables (e.g., housing_median_age and total_rooms), categorize them according to their .3 and .7 quantiles, and label the elements as L, M, and H. Then find their cross tabulation.
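The steps can be sketched as follows. The data here are randomly simulated stand-ins for the real columns, and the helper name `to_lmh` is hypothetical.

```python
import numpy as np
import pandas as pd

# Simulated stand-in data; the real housing data are not reproduced here
rng = np.random.default_rng(0)
df = pd.DataFrame({'housing_median_age': rng.integers(1, 53, size=200),
                   'total_rooms': rng.integers(100, 6000, size=200)})

def to_lmh(col):
    """Label values as L, M, or H using the .3 and .7 quantiles as cut points."""
    q30, q70 = col.quantile([0.3, 0.7])
    return pd.cut(col, bins=[-np.inf, q30, q70, np.inf], labels=['L', 'M', 'H'])

# counts of rows for every combination of the two categorized variables
table = pd.crosstab(to_lmh(df['housing_median_age']), to_lmh(df['total_rooms']))
print(table)
```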
Read more...
Using df.apply(fun), you can apply a function to the columns or rows:
df.apply(np.sum, axis=0)  # apply to each column
df.apply(np.sum, axis=1)  # apply to each row
Keras is a deep learning library written in Python that runs on top of TensorFlow, CNTK, or Theano.
To show how to fit a multiple regression model using R and Python, we consider the car data [car], which contains car specifications such as HwyMPG, Model, etc. We fit different regression models to predict highway MPG from the car specifications.
Read more...
This text is a self-study guide for learning how to use SQL to work with databases.
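For readers who want to try SQL statements from Python, a minimal sketch using the standard sqlite3 module is shown here. The table name, column names, and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # temporary in-memory database
cur = conn.cursor()

# create a table and insert a few made-up rows
cur.execute('CREATE TABLE cars (model TEXT, hwy_mpg REAL)')
cur.executemany('INSERT INTO cars VALUES (?, ?)',
                [('A', 30.0), ('B', 41.5), ('C', 25.0)])
conn.commit()

# select the rows that satisfy a condition, sorted in descending order
cur.execute('SELECT model, hwy_mpg FROM cars WHERE hwy_mpg > ? ORDER BY hwy_mpg DESC',
            (28,))
rows = cur.fetchall()
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which avoids building queries by string concatenation.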
Read more...
A brief cheatsheet for those who use R daily and use other scripting languages, Python and MATLAB, occasionally.
Read more...