Is a switch from R to Python worth it? [closed]


31

I have just completed a 1-year data science master's program where we were taught R. I have found that Python is more popular and has a bigger community in AI.

Is it worth it for someone in my position to move to Python and, if so, why? Does Python have any game-changing features that are not available in R, or is it only a matter of community?


2
Was this a course at a public college, a private university, or a corporate education system?
Manuel Rodriguez

12
You can't switch yourself to Python. You are not talking about a project you already wrote in R and want to port to Python; you are simply asking about learning Python (not forgetting R). Is it worth learning Python? Nowadays it is almost impossible not to learn Python if you work with anything related to data handling with a computer...
lvella

1
I'm not sure why this wasn't closed as opinion based, but I'm glad.
Evorlor

1
@Evorlor See my answer, which actually states that this question will also lead to primarily opinion-based answers. Moreover, I voted to close this question as primarily opinion-based, even though I also gave an answer. On this website, there are a lot of questions of this kind. I'm no longer sure whether that is a good thing or not, but the existing answers to this question are useful for a lot of people.
22

2
How is this not closed?? This is a canonical opinion-based question.
1

Answers:


60

I want to reframe your question.

Don't think about switching, think about adding.

In data science you can go very far with either Python or R, but you will go furthest with both.

Python and R integrate very well, thanks to the reticulate package. I often tidy up data in R because it is easier for me, train a model in Python to benefit from its better speed, and visualise the results in R with ggplot, all in one notebook!

If you already know R, there is no sense in abandoning it; use it where it is sensible and easy for you. But adding Python for many uses is 100% a good idea.

Once you feel comfortable in both, you will have a workflow that works best for you, built around your favourite language.


I fully agree with your point of view. Just add it and try not to abandon your R skill set.
Jens Kohl

3
upvote for reticulate. rstudio even has some support for inspecting python and is in the process of adding more
blues

The only problem with using both is that you make everyone else downstream depend on having R and Python too. While I agree about "adding" to your skillset - I would still keep things pure when writing scripts!
PascalVKooten

reticulate allows R to use Python; similarly, rpy2 allows Python to use R. It's common for programming languages with similar purposes to have some way to talk to each other.
J.G.

Exactly. Don't trade in your tool, but add one to your belt. Now you have more tools, allowing you to solve more problems in a better way.
Mast

28

Of course, this type of question will also lead to primarily opinion-based answers. Nonetheless, it is possible to enumerate the strengths and weaknesses of each language with respect to machine learning, statistics, and data analysis tasks, which I will try to do below.

R

Strengths

  • R was designed and developed for statisticians and data analysts, so it provides out of the box (that is, as part of the language itself) features and facilities for statisticians that are not available in Python unless you install a related package. One example is the data frame, which Python does not provide unless you install the well-known pandas package. Other examples are matrices, vectors, etc. Python has similar data structures, but they are more general and not specifically targeted at statisticians.

  • There are a lot of statistical libraries.
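
To make the "more general data structures" point concrete, here is a minimal sketch using only Python's built-in list (no packages installed). The variable names are made up for illustration. Unlike an R vector, a plain Python list does not do element-wise arithmetic:

```python
values = [1, 2, 3]

# `*` on a built-in list repeats it; it does not multiply element-wise.
repeated = values * 2

# Element-wise arithmetic needs an explicit loop or comprehension
# (or a third-party package such as numpy or pandas).
doubled = [v * 2 for v in values]

print(repeated)  # [1, 2, 3, 1, 2, 3]
print(doubled)   # [2, 4, 6]
```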

Weakness

Python

Strengths

  • A lot of people and companies, including Google and Facebook, invest a lot in Python. For example, the main programming language of TensorFlow and PyTorch (two widely used machine learning frameworks) is Python. So, it is very unlikely that Python won't continue to be widely used in machine learning for at least 5-10 more years.

  • The Python community is likely a lot bigger than the R community. In fact, for example, if you look at Tiobe's index, Python is placed 3rd, while R is placed 20th.

  • Python is also widely used outside of the statistics or machine learning communities. For example, it is used for web development (see e.g. the Python frameworks Django or Flask).

  • There are a lot of machine learning libraries (e.g. TensorFlow and PyTorch).

Weakness

  • It does not provide, out-of-the-box, the statistical and data analysis functionalities that R provides, unless you install an appropriate package. This might be a weakness or a strength, depending on your philosophical point of view.
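
As a rough illustration of what Python does ship with, the sketch below uses only the standard-library `statistics` module. The sample data is invented for the example. Basic descriptive statistics are available out of the box; anything beyond that (data frames, model fitting, and so on) usually means installing a package:

```python
import statistics

# Hypothetical sample data, invented for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

mean_height = statistics.mean(heights_cm)      # arithmetic mean
median_height = statistics.median(heights_cm)  # middle value
spread = statistics.stdev(heights_cm)          # sample standard deviation

print(mean_height, median_height, round(spread, 2))
```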

There are other possible advantages and disadvantages of these languages. For example, both languages are dynamic. However, this feature can both be an advantage and a disadvantage (and it is not strictly related to machine learning or statistics), so I did not list it above. I avoided mentioning opinionated language features, such as code readability and learning curve, for obvious reasons (e.g. not all people have the same programming experience).

Conclusion

Python is definitely worth learning if you are studying machine learning or statistics. However, it does not mean that you will not use R anymore. R might still be handier for certain tasks.


3
It seems like the "out of the box" feature set is irrelevant. The relevant thing is the availability of packages that do what you want, no?
Dean MacGregor

1
@DeanMacGregor If you do not have access to the internet, this feature is relevant! Furthermore, if a programming language already provides a feature out of the box, you do not have to lose time looking for it.
nbro

Considering Python is heavily invested in being 'batteries included', this weakness is not one you encounter often, especially since there are Python distributions in use that do have statistical packages included. For data science in particular, Anaconda is quite popular and solves your immediate concern.
Mast

6

I didn't have this choice because I was forced to move from R to Python:

It depends on your environment: when you are embedded in an engineering department, a technical working group, or something similar, then Python is more feasible.

When you are surrounded by scientists and especially statisticians, stay with R.

PS: R offers keras and tensorflow as well, though they are implemented with Python under the hood. Only very advanced stuff will make you need Python. Though I'm getting more and more used to Python, the syntax in R is easier. And though each package has its own, it is somehow consistent, while Python is not. And ggplot is so strong. Python has a clone (plotnine), but it lacks several (important) features. In principle you can do nearly as much as in R, but visualization and data wrangling in particular are much easier in R. Thus, the most famous Python library, pandas, is a clone of R.

PPS: Advanced statistics definitely points to R. Python offers a lot of everyday tools and methods for a data scientist, but it will never reach the >13,000 packages R provides. For example, I had to do an inverse regression and Python doesn't offer this. In R you can choose between several confidence tests and whether it is linear or nonlinear. The same goes for mixed models: they are implemented in Python, but the implementation is so basic that I can't see how it can be sufficient for anyone.


4

I would say yes. Python is better than R for most tasks, but R has its niche and you would still want to use it in many circumstances.

Additionally, learning a second language will improve your programming skills.

My own perspective on the strengths of R vs Python is that I would prefer R for a small, single-purpose program involving tables or charts, or exploratory work in the same vein. I would prefer Python for everything else.

  • R is really good for table mashing. If most of what a particular program is going to do is smoosh some tables into different shapes, then R is the thing to pick. Python has tools for this, but R is designed for it and does it better.
  • It's worth switching to R whenever you need to make a chart, because ggplot2 is a masterpiece of API usability and matplotlib is a crawling horror.
  • Python is well designed for general purpose programming. It has a very well designed set of standard data structures, standard libraries, and control flow statements.
  • R is poorly suited for general purpose programming. It doesn't handle tree-structured or graph-structured data well. It has some rules (like being able to look into and modify your parent scope) which are immediately convenient, but when used lead to programs that are hard to grow, modify, or compose.
  • R also has some straightforwardly bad things in it. These are mostly just historical leftovers like the three different object systems.

To elaborate more on the last point: computer programming done well is lego where you make your own bricks (functions and modules).

Programs are usually modified and repurposed past their original design. As you build them, it is useful to think about which parts might be reused, and to build those parts in a general way that will let them plug in to the other bricks.

R encourages you to melt all the bricks together.
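
To illustrate the "bricks" idea and the point about tree-structured data, here is a minimal Python sketch using only the standard library. The `Node` class, the helper functions, and the sample tree are all hypothetical, invented for this example; it is one possible way to structure things, not the only one:

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """One node of a tree; `children` holds nested Node objects."""
    name: str
    value: int = 0
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def total(node: Node) -> int:
    """Reuse `walk` as a brick: sum the values of the whole subtree."""
    return sum(n.value for n in walk(node))


# Hypothetical example data.
tree = Node("root", 1, [Node("a", 2, [Node("a1", 3)]), Node("b", 4)])
names = [n.name for n in walk(tree)]

print(names)        # ['root', 'a', 'a1', 'b']
print(total(tree))  # 10
```

Here `walk` is a small general-purpose brick, and `total` is built on top of it without knowing how the tree is stored.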


1

As others have said, it's not a "switch". But is it worth adding Python to your arsenal? I would say certainly. In data science, Python is popular and becoming ever more popular, while R is receding somewhat. And in the fields of machine learning and neural networks, I'd say that Python is the main language now -- I don't think R really comes close here in terms of usage. The reason for all of this is generality. Python is intended as a general programming language, and allows you to easily script all kinds of tasks. If you're staying strictly within a neatly structured statistical world, R is great, but with AI you often end up having to do novel, miscellaneous things, and I don't think R can beat Python at that. And because of this, I think Python and its packages will be receiving more support and development when it comes to the more cutting-edge tech.


0

This is totally my personal opinion.

I read in my office (at a construction site) that "There is a right tool for every task."

I expect to face a variety of tasks as a programmer. I want as many tools as I can "buy or invest in". One day one tool will help me solve a task, some other day some other tool. R (for statistics) and Python (for general use) are two tools I definitely want with me, and I think they are worth the investment for me.

As far as switching is concerned, I will use the most efficient tool I know (where efficiency is measured by the client's requirements, the time and cost investment, and the ease of coding). The more tools I know, the merrier! Of course there is a practical limit to it.

All this is my personal opinion and not necessarily correct.


0

It sounds like you have invested 1 year in data science with R and are embedded in the R environment, but want to explore Python for data science.

First learn the basics of Python, like how lists and tuples work and how classes and objects work.
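
A minimal sketch of those basics, using only built-in Python. The `Dataset` class and all variable names are hypothetical, made up for illustration:

```python
# A list is mutable: items can be added or changed in place.
scores = [3, 1, 2]
scores.append(5)
scores.sort()

# A tuple is immutable: assigning to an element raises TypeError.
point = (1.5, 2.5)
try:
    point[0] = 0.0
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False


class Dataset:
    """Hypothetical class, used only to illustrate classes and objects."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)  # store a copy of the input

    def mean(self):
        return sum(self.values) / len(self.values)


# An object is an instance of a class, with its own data and methods.
ds = Dataset("scores", scores)
print(scores, tuple_is_mutable, ds.mean())
```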

Then get your hands dirty with some libraries like numpy, matplotlib, and pandas. Learn tensorflow or keras, and then go for data science.


-1

Person who chases two rabbits catches neither

And yes, Python is more popular. I work in both but, from a business standpoint, it's easier to find a job in Python than in R.

So, you could:

  • Pick Python because it is more popular. However, you must start from scratch.

Or

  • Stay with R; after all, you have one year's worth of training with R. But it is not as popular.

The suggestion here that learning an additional programming language will somehow leave you worse off is nonsense. Learning additional programming languages, especially those that are unfamiliar, will always improve your skills as a programmer in any language.
Will Da Silva
Licensed under cc by-sa 3.0 with attribution required.