Python Lectures - Shared screen with speaker view
Christopher Graham Akey
13:07
How can I get the customer credit data set?
Zachary Qian
13:52
You can download the repository from GitHub too
TA-Jackie Ayoub
14:41
Yep this is another method
Abir Sarkar
16:49
It shows that pd is not defined
Tannistha Mondal
17:01
import pandas
TA-Jackie Ayoub
17:16
import pandas as pd
TA-Jackie Ayoub
17:33
Try to run what I sent first
Abir Sarkar
17:45
thanks
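The fix above comes down to the alias: the name `pd` only exists once `import pandas as pd` has run in the current session, whereas a plain `import pandas` binds only the name `pandas`. A minimal sketch; the small DataFrame is made up for illustration and is not the workshop's credit data set.

```python
import pandas as pd  # binds the alias `pd`; plain `import pandas` would not

# Hypothetical toy data, not the workshop's credit data set.
toy = pd.DataFrame({"Income": [14.9, 106.0, 104.6], "Rating": [283, 483, 514]})
print(toy.head())
```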
Esther T. Adegoke
17:59
How do you round to the nearest integer? I tried round(0) but it still shows the tenths place.
TA-Mathi
18:43
just try round()
TA-Mathi
18:51
without any arguments in it
Andrew Zheng
31:58
.round( ) with no arguments still gives 1 decimal on my end here
TA-Mathi
32:14
pandas round() or round(0) still shows one decimal place (.0) because rounding a float column returns floats. To remove it, convert the result to an integer type, for example with astype(int).
Andrew Zheng
32:51
I see, thanks
TA-Mathi
35:12
Sorry, I think I need to add one more detail. pandas round() keeps each column's dtype, so integer columns stay integers, but not when combined with describe(), because describe() returns floats.
TA-Mathi
35:49
credit.round() will leave integer columns as integers, but credit.describe().round() will show .0 after every value because of the float return type of the describe() method
TA-Mathi
37:47
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.describe.html this can give more detailed info if needed
TA-Mathi
37:58
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.round.html
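The rounding behaviour discussed above can be checked on a small example. This is only a sketch: the `credit` frame below is a hypothetical stand-in with made-up values, and `astype(int)` is one way to drop the trailing .0 once the values are already whole numbers.

```python
import pandas as pd

# Hypothetical stand-in for the workshop's `credit` DataFrame.
credit = pd.DataFrame({
    "Income": [14.891, 106.025, 104.593],  # float column
    "Rating": [283, 483, 514],             # integer column
})

rounded = credit.round()             # float column stays float, int column stays int
summary = credit.describe().round()  # describe() returns floats, so values show .0
as_int = summary.astype(int)         # cast to int to drop the trailing .0

print(rounded.dtypes)
print(as_int)
```

If a summary ever contains missing values, the cast to int would fail, so the cast is best kept as a display step at the end.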
Andrew Zheng
55:00
when attempting to plot, I get the output [<matplotlib.lines.Line2D at 0x7fd42d180278>] without any plot being displayed
Andrew Zheng
55:09
I wonder if I need to update matplotlib
TA-Mathi
55:51
No it does not need any update
TA-Jackie Ayoub
55:58
pyplot.show()
TA-Jackie Ayoub
56:03
Can you try this
TA-Mathi
56:04
Can you share the lines of code in your cell?
TA-Jackie Ayoub
56:14
That isn't an error. That has created a plot object but you need to show the window. That's done using pyplot.show()
TA-Jackie Ayoub
56:38
Or try show()
TA-Mathi
56:48
yeah you can try plt.show()
TA-Jackie Ayoub
57:07
also you can try this : import matplotlib.pyplot as plt
Andrew Zheng
57:12
.show( ) worked
TA-Jackie Ayoub
57:20
good
Andrew Zheng
57:20
forgot to add that part
Andrew Zheng
57:22
lol thanks
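For reference, a minimal sketch of the pattern that resolved this. The plotted numbers are made up. The `[<matplotlib.lines.Line2D ...>]` text is just the return value of `plot()` being echoed; whether the figure also appears without `show()` depends on how the notebook backend is configured.

```python
import matplotlib.pyplot as plt

# plot() returns a list of Line2D objects; a notebook echoes that list
# when the call is the last expression in a cell.
lines = plt.plot([1, 2, 3], [2, 4, 8])

# show() asks the active backend to render the current figure.
plt.show()
```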
Andrew Zheng
01:13:02
Did the creators of the comics develop the .xkcd() method?
Zachary Qian
01:15:05
lmaooooo
Andrew Zheng
01:40:03
I closed and reopened the tab and it fixed everything
Andrew Zheng
01:40:05
:D
TA-Mathi
01:40:29
Oh :D that's sometimes a crazy solution for most of the issues!
Esther T. Adegoke
01:42:41
What if we want to separate the plots?
Andrew Zheng
01:43:16
are outliers incorporated into or excluded from the calculation of each quartile to generate the boxplot?
TA-Mathi
01:46:23
@Andrew: I believe they are included in the quartile calculation; the points that fall beyond the whiskers are then plotted as individual points in boxplots
TA-Mathi
01:46:43
you can check the 'whis' part in this link: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.axes.Axes.boxplot.html
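As the `whis` entry in the linked documentation describes, the default whiskers extend to the last data point within 1.5 times the interquartile range of the box, and points beyond them are drawn individually. The quartiles themselves are computed from all of the data, extreme values included. A small sketch of that arithmetic with NumPy, using made-up numbers:

```python
import numpy as np

# Made-up sample with one extreme value.
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# Quartiles are computed from every observation, including the extreme one.
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# With whis=1.5, points beyond these fences are drawn as individual points.
whis = 1.5
lower_fence = q1 - whis * iqr
upper_fence = q3 + whis * iqr
fliers = data[(data < lower_fence) | (data > upper_fence)]

print(q1, median, q3)
print(fliers)
```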
TA-Mathi
01:47:20
@Esther: What do you mean by separate plots? Are you asking about plotting each category in different plots?
TA-Jackie Ayoub
01:47:46
I believe she wants each level of x in a boxplot
TA-Jackie Ayoub
01:48:25
I don't see how we can do that directly using the function
TA-Jackie Ayoub
01:49:07
Probably we can separate the levels before plotting and then we can specify each level in x
TA-Mathi
01:49:13
Oh I see. We can do it only by filtering the data itself to the respective x category and plot it
TA-Jackie Ayoub
01:49:20
yep
Esther T. Adegoke
01:50:06
I was asking how we can graph the boxplot by itself without adding it to the stripplot. Sorry for the confusion.
TA-Mathi
01:51:06
Oh I see
TA-Mathi
01:51:23
You can, but it won't show the individual data points the way the strip plot does
TA-Mathi
01:51:35
But rather only a static box in the data range
TA-Mathi
01:52:00
Generally boxplot is used in combination with a stripplot
Esther T. Adegoke
01:52:32
Thank you!
TA-Mathi
01:52:39
no problem
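To illustrate the answer to Esther's question, here is a sketch that draws the boxplot by itself next to the combined version. The toy data and the column names `group` and `value` are placeholders, not columns of the credit data set, and the side-by-side layout is just one way to compare the two.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data; column names are placeholders.
toy = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 4.0, 3.0, 5.0, 9.0],
})

fig, (ax_box, ax_both) = plt.subplots(1, 2, figsize=(8, 3))

# Left: the boxplot by itself (summary only, no individual points).
sns.boxplot(x="group", y="value", data=toy, ax=ax_box)

# Right: the same boxplot with a stripplot drawn on top of it.
sns.boxplot(x="group", y="value", data=toy, ax=ax_both)
sns.stripplot(x="group", y="value", data=toy, color="black", ax=ax_both)

plt.show()
```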
Christopher Graham Akey
01:55:22
sns.heatmap(data=credit.corr(),cmap="Blues",amount=True) gave me an error
Andrew Zheng
01:55:43
annot=True
Andrew Zheng
01:55:48
not amount I believe
Christopher Graham Akey
01:56:18
Thanks Andrew! :)
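A sketch of the corrected call. The `credit` frame here is a hypothetical numeric stand-in with made-up values; only the `annot=True` keyword differs from the line that raised the error.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical numeric stand-in for the workshop's `credit` DataFrame.
credit = pd.DataFrame({
    "Income": [14.9, 106.0, 104.6, 148.9],
    "Limit": [3606, 6645, 7075, 9504],
    "Rating": [283, 483, 514, 681],
})

# annot=True writes each correlation value inside its cell.
ax = sns.heatmap(data=credit.corr(), cmap="Blues", annot=True)
plt.show()
```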
Andrew Zheng
02:09:08
ImportError: cannot import name 'get_dummies'
Andrew Zheng
02:09:30
maybe there’s some pre-workshop step I neglected by accident
TA-Mathi
02:09:47
Seems your pip needs an upgrade
TA-Mathi
02:10:16
Maybe the version you have does not match the most recent one
Andrew Zheng
02:10:26
I upgraded pip half an hour ago when I was dealing with the previous issue
Zachary Qian
02:10:47
you can also try conda
Zachary Qian
02:11:19
i.e. conda install ____ on terminal
Andrew Zheng
02:17:11
I ended up upgrading a bunch of dependent packages, and uninstalling and reinstalling statsmodels
Andrew Zheng
02:17:18
things look normal now
Andrew Zheng
02:17:30
thanks everyone for the suggestions
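The chat does not show the import statement that failed, so the following is only a sketch of the usual way to reach the function: `get_dummies` is a top-level pandas function. The toy data and column names are placeholders. Printing the installed version is a quick check when an import breaks after upgrading packages.

```python
import pandas as pd

# Hypothetical toy data; the column names are placeholders.
toy = pd.DataFrame({"Student": ["Yes", "No", "No"], "Income": [14.9, 106.0, 104.6]})

# One indicator column per category level; numeric columns pass through.
encoded = pd.get_dummies(toy, columns=["Student"])

print(pd.__version__)
print(encoded.columns.tolist())
```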
Julian Bernado
02:17:52
Income, Limit, Rating were used for the first model but Income, Rating, Education were used for the second
Zachary Qian
02:27:52
when you do train test split, how do you adjust for sparsity? since we have an already imbalanced label set, how do we adjust this?
TA-Jackie Ayoub
02:29:31
Usually you need to randomize (shuffle) the data if you can't collect more data
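One common option for Zachary's question, not mentioned in the chat, is a stratified split, which shuffles the rows and keeps the class ratio the same in the training and test sets. The sketch below assumes scikit-learn's `train_test_split` is the function being used; the data are made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up imbalanced labels: 90 negatives, 10 positives.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

# shuffle=True (the default) randomizes the rows; stratify=y keeps the
# class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print(y_train.mean(), y_test.mean())
```

Stratifying only controls how the rows are divided; it does not change the imbalance itself.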
Joshua’s iPad
02:49:45
Thank you!!!
Alejandra Solis Sala
02:49:46
Thank you!
Madeline Peterson
02:49:47
Thank you!
Jamie Forschmiedt (she/her)
02:49:48
Thank you!!
Claire Kong
02:49:48
Thank you!!
Ian M.
02:49:48
thank you !
Rami Shams
02:49:49
thank you
Ananya Iyer
02:49:50
Thank you!
Madeline Powers
02:49:50
Thank you!