Install PySpark on Linux

You have probably heard of it: wherever big data is discussed, the name eventually comes up. In layman’s terms, Apache Spark is a large-scale data processing engine. Apache Spark provides various APIs for performing big data processing on its engine. PySpark is the Python API, exposing the Spark programming model to Python applications. In my previous blog post, I talked about how to set it up on Windows. This time, we shall do it on Red Hat Enterprise Linux 8 or 7. You can follow along with a free AWS EC2 instance, your hypervisor (VirtualBox, VMware, Hyper-V, etc.), or a container on almost any Linux distribution. The commands discussed below might change slightly from one distribution to the next.

As with most of my blog posts, my objective is to write a comprehensive post on real-world, end-to-end configuration, rather than talking about just one step. On Red Hat 7, I ran into a problem. I solved this problem without having to solve it.

The Author


  • Linux (I am using Red Hat Enterprise Linux 8 and 7)
  • Java
  • Hadoop
  • Spark
  • Anaconda or pip based virtual python environment
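Before starting, it can help to see which of these prerequisites are already installed. The following is a minimal sketch, not part of the original setup: it assumes each prerequisite exposes a command on your PATH, and the command names in the mapping (java, hadoop, spark-submit, python3) are assumptions that you may need to adjust for your distribution.

```python
import shutil

# Assumed mapping of prerequisite -> command expected on PATH once installed.
# These command names are illustrative; change them to match your setup.
PREREQUISITES = {
    "Java": "java",
    "Hadoop": "hadoop",
    "Spark": "spark-submit",
    "Python": "python3",
}


def check_prerequisites(commands=PREREQUISITES, which=shutil.which):
    """Return {name: resolved path or None} for each prerequisite command."""
    return {name: which(cmd) for name, cmd in commands.items()}


if __name__ == "__main__":
    for name, path in check_prerequisites().items():
        print(f"{name:8} {path or 'not found on PATH'}")
```

Anything reported as not found is either not installed yet or installed in a directory that is not on your PATH.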

I have broken out the process into steps. Please feel free to skip a section as you deem appropriate.

Continue reading “Install PySpark on Linux”

Windows: configure VS Code integrated bash shell for Anaconda

So you are, or have been, using Python on Windows. You know your way around setting up the PATH variable so that you can type “python” in your command prompt and it works. Now, say that you want to use Anaconda Python in bash. Let’s go one step further and say you want to use bash from the Visual Studio Code integrated shell. The process isn’t too different. There doesn’t seem to be a guide that covers all of these together, hence this post.

My goal is to show you one of the possible ways to configure your development environment quickly, to get you going in no time.

At the end you should have the following:

  • A bash shell working with Python, and
  • Visual Studio Code shell integration (optional)
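One way to confirm the first goal is to ask Python itself which interpreter bash picked up. This is a minimal sketch under one assumption: that a conda-managed installation can be recognised by a conda-meta folder inside its prefix. The function names are hypothetical helpers, not part of Anaconda or VS Code.

```python
import os
import shutil
import sys


def is_conda_prefix(prefix):
    """Assumption: conda-managed prefixes contain a 'conda-meta' folder."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def describe_python():
    """Summarise the running interpreter and what 'python' resolves to."""
    return {
        "running": sys.executable,            # interpreter executing this script
        "on_path": shutil.which("python"),    # what typing "python" would launch
        "is_conda": is_conda_prefix(sys.prefix),
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(f"{key}: {value}")
```

Run it from the bash shell you configured. If on_path is empty or is_conda is False, the shell is probably not picking up the Anaconda directories on PATH.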

Continue reading “Windows: configure VS Code integrated bash shell for Anaconda”

Ubuntu: deploy .NET Core app

Today, I want to walk through the steps I used to deploy an ASP.NET Core website application to Ubuntu Server. ASP.NET Core supports several Linux distributions; I am using Ubuntu Server.

From a quick internet search I found 3 decent blog posts:

  1. article
  2. article
  3. post

There are already many articles that talk about how to set up your development environment for .NET Core, but this post starts where they end. It’s about getting production ready.

These are to the point and well written. However, these posts are either more than a year old or cover setting up a development environment rather than a production deployment. You need to install the .NET Core runtime, not the .NET Core SDK (which also includes the runtime). You can download it from: Continue reading “Ubuntu: deploy .NET Core app”

Install OpenCV: ~10 min!


  1. Tested on: Windows 7, Windows 10
  2. There could be other ways to get OpenCV working on your Windows computer

Before installation

  1. Find your OS architecture
  2. Decide the Python version (e.g. 2.7 or 3.x)
  3. Install Anaconda, open terminal as administrator and , Continue reading “Install OpenCV: ~10 min!”
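If Python is already installed, the first two items can be checked from Python itself. This is a small sketch using only the standard library; the function names are hypothetical, and the machine value is simply whatever the operating system reports.

```python
import platform
import struct
import sys


def python_bitness():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def environment_summary():
    """Collect the OS, machine type, and Python details relevant to the install."""
    return {
        "os": platform.system(),          # e.g. "Windows"
        "machine": platform.machine(),    # machine type as reported by the OS
        "python_bits": python_bitness(),
        "python_version": "{}.{}".format(*sys.version_info[:2]),
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

The bitness of the interpreter matters as much as that of the OS, since a 32-bit Python can run on a 64-bit Windows installation.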

Read/Write Large XML Files

To a certain extent, performance depends on the .NET version your application is running on.

Another quick reference is the Microsoft Patterns and Practices article

There are 4 ways to read and write XML

Continue reading “Read/Write Large XML Files”

Getting Started: Python

I personally think that Visual Studio is the best IDE, and I spend most of my day in it. I’d definitely be using Visual Studio as much as I can; if you are on Windows, Visual Studio’s Python support is worth exploring.

I’d leave the choice up to you. Both IDEs have free and paid options.

I think Visual Studio is an engineering marvel. If as much effort went into space travel as went into the design of this, we’d be on Mars by now.

That said, if you prefer to use PyCharm:

  • Please go to and get your free, full-blown tools offered by JetBrains, with all the bells and whistles.
  • Here are two YouTube video series if you want:

Getting Started with PyCharm:

PyCharm Video Demos:





If you want to use Visual Studio: