I think I am able to develop end-to-end ML projects. However, my question was about building a career in this space: how to move from junior to senior, and from senior to expert in the field.
Cython is compiled C that uses CPython's objects. If you can distill your algorithm to a full C(ython) implementation, you get CPython objects + C code, which is then compiled with the (appropriate version of the) system compiler.
Good paper. I was wondering what the state of the art is in using neural networks for text segmentation, lemmatisation, and part-of-speech tagging. Morphological approaches are dominant in this space.
I have found RSS feeds to be a better alternative for reading the news beyond the filter bubble that our social media platforms create.
There was an interesting tool for monitoring an RSS list of newspapers [1] on HN some time ago. I wish this tool [2] were hosted somewhere, so I could easily get notifications in my Slack without setting it up or managing it. With the load of information we face every day, the idea of monitoring RSS feeds through a Slack interface is very interesting.
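To make the RSS-to-Slack idea concrete, here is a minimal sketch using only the Python standard library. It is not the tool referenced above: the function names are made up for illustration, the feed is assumed to be plain RSS 2.0, and `webhook_url` is assumed to be a Slack incoming webhook that you have created yourself.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss_items(rss_xml):
    """Return a list of (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:
            items.append((title, link))
    return items


def new_items(items, seen_links):
    """Keep only items whose link has not been reported yet."""
    return [(title, link) for title, link in items if link not in seen_links]


def slack_payload(title, link):
    """Build the JSON body for a Slack incoming webhook message."""
    return json.dumps({"text": f"{title}\n{link}"}).encode("utf-8")


def notify(webhook_url, items, seen_links):
    """Post each unseen item to Slack and remember its link.

    webhook_url is an assumption: an incoming webhook URL you configured.
    """
    for title, link in new_items(items, seen_links):
        request = urllib.request.Request(
            webhook_url,
            data=slack_payload(title, link),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(request)
        seen_links.add(link)
```

Running `notify` on a schedule (cron, for example) with a persisted `seen_links` set would cover the basic use case; the hosting and management burden mentioned above is exactly the part this sketch does not solve.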
How about SageMaker? Can we include it in this list? I played with SageMaker some time ago, and it helps you build a whole pipeline to host your models, in addition to hosting your notebooks and bridging the gap between data scientists and data engineers.
Anecdotally, we considered using the hosted versions of Jupyter and Apache Zeppelin that are part of AWS SageMaker and EMR. We couldn't figure out a simple/familiar workflow for keeping the notebooks under version control. So, we agreed to run the notebooks locally, use a familiar Git-based workflow, and interact with the AWS infrastructure through the local notebook instances.
Well, good question. The file format for Jupyter is not ideal for 'code craftsmanship', as pointed out by another comment. There are utilities to strip out some of the metadata from the Jupyter files, such as rendered output and run counters, but that is a trade-off to be decided by your team:
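As an illustration of what such a utility does, here is a minimal sketch that clears rendered output and run counters before a commit. It assumes the notebook is in the nbformat v4 JSON layout (a top-level `cells` list, with `outputs` and `execution_count` on code cells); the function names are hypothetical and this is not any particular existing tool.

```python
import json


def strip_notebook(notebook):
    """Return a copy of a notebook dict without outputs or run counters."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def strip_file(path):
    """Rewrite the .ipynb file at path in place with outputs removed."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(strip_notebook(notebook), handle, indent=1)
        handle.write("\n")
```

The trade-off is visible in the code: diffs get much smaller, but anyone reading the committed notebook has to re-run it to see results.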
You don't need all of this. All you need is to request your data from Twitter (Your Tweet archive
> https://twitter.com/settings/account). Iterate through the CSV file and use tweet_id to unlike, remove, or do whatever you want through the Twitter API.
Source: I have done it before, and it took less time/work than what you have stated.
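The loop described above could look roughly like this. It is a sketch under assumptions: the archive CSV is assumed to have a header row with a `tweet_id` column, and `action` is a hypothetical stand-in for whatever authenticated API call you want to make (delete, unlike, and so on), since no client library is specified here.

```python
import csv
import io


def read_tweet_ids(csv_file, id_column="tweet_id"):
    """Collect the tweet ids from an open archive CSV file."""
    reader = csv.DictReader(csv_file)
    return [row[id_column] for row in reader if row.get(id_column)]


def apply_to_tweets(tweet_ids, action):
    """Call action(tweet_id) for every id; return the ids for which it raised.

    action is a placeholder for a real API call supplied by the caller.
    """
    failed = []
    for tweet_id in tweet_ids:
        try:
            action(tweet_id)
        except Exception:
            failed.append(tweet_id)
    return failed


# Tiny in-memory stand-in for the downloaded archive file.
sample = io.StringIO("tweet_id,text\n1,hello\n2,world\n")
ids = read_tweet_ids(sample)
```

Collecting the failures instead of stopping at the first error makes it easy to see which ids the API refuses to act on.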
When you say "all of this", that's only true for the browser-scraping part. You still need to use the API's CreateFavorite and DestroyFavorite calls on old tweets.
(As discussed, purely calling DestroyFavorite won't work on Tweets outside the 3200-tweet-capped API-accessible data store).
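As I read the point above, the workaround is to like the old tweet again and then unlike it. A minimal sketch of that sequence follows; `create_favorite` and `destroy_favorite` are hypothetical callables that wrap the CreateFavorite and DestroyFavorite calls of whichever client you use, and are not real library functions.

```python
def unlike_old_tweet(tweet_id, create_favorite, destroy_favorite):
    """Re-like a tweet, then unlike it, so the unlike actually registers."""
    try:
        create_favorite(tweet_id)
    except Exception:
        # Assumption: the like may already be registered on the API side;
        # still attempt to remove it.
        pass
    destroy_favorite(tweet_id)


def unlike_all(tweet_ids, create_favorite, destroy_favorite):
    """Apply the like-then-unlike sequence to every id; return the failures."""
    failed = []
    for tweet_id in tweet_ids:
        try:
            unlike_old_tweet(tweet_id, create_favorite, destroy_favorite)
        except Exception:
            failed.append(tweet_id)
    return failed
```

A real run would also need to respect the API's rate limits, which this sketch ignores.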
With the exception of the data retrieval method, the OP tried this, and suggests that simply having the tweet_id is not enough, if the tweet it corresponds to happens to be old enough (or something) to not be accessible by the API.
I did exactly this a while ago (before deleting my Twitter account for good) and with the id extracted from the downloaded CSV I could delete everything. Perhaps they've changed policies recently.
I wrote a couple of Python scripts to keep your timeline tidy (delete everything from the beginning, then trim and leave only the last N): https://github.com/rinze/obliterate_tweets
I hope like hell that Snips succeeds. This IoT stuff really needs to be open source. I don't want to fill my home with sensors that I don't control, and I'm technical enough to deploy some homebrew stuff, but there's just not a whole lot out there as far as open source smart devices go.
edit: Can anyone name some other privacy-respecting, non-cloud, open source platforms and devices to work with?
To somewhat answer my own question, Snips has a pretty active community, which should hopefully be a good entrée into the ecosystem of privacy-friendly hardware projects
(Discord, Twitter, and https://github.com/snipsco/awesome-snips#community-projects).