Diffstat (limited to 'programming')
24 files changed, 0 insertions, 2174 deletions
diff --git a/programming/a_plan_against_the_current_web.md b/programming/a_plan_against_the_current_web.md deleted file mode 100644 index 55e58da4..00000000 --- a/programming/a_plan_against_the_current_web.md +++ /dev/null @@ -1,19 +0,0 @@ -# A plan against the current web - -Browsers are controlled by Google, and to a lesser extent, Apple. -We have arrived at this point because browsers have become excessively complex; even Microsoft has decided that it is better to be subject to the whims of Google than to maintain a browser of its own. -This complexity derives from using the web to deliver applications, when its initial purpose was browsing content. - -## Part I: Make content websites simple again - -See [the content web manifesto](the-content-web-manifesto). - -## Part II: Application distribution sucks - -We use so many web applications nowadays because there are no alternative platforms to distribute applications with the same reach and convenience. - -As a way to start this discussion, let's propose making Flatpak applications work on Windows and macOS, and making them installable and executable from a web link (like Java Web Start or ClickOnce from Microsoft). -Additionally, let's produce "starter packs" for as many programming languages as possible, so that creating these applications is "easy". - -Besides all the implementation work, what would be the downsides to this? -I believe that this would offer better performance than web apps (Flatpak applications on Linux are consistently faster than web apps and Electron apps), and Flatpak apps can already be implemented using many programming languages (web apps are halfway there through WASM, but not there yet).
diff --git a/programming/about_apis.md b/programming/about_apis.md deleted file mode 100644 index 2c44f41e..00000000 --- a/programming/about_apis.md +++ /dev/null @@ -1,31 +0,0 @@ -# About APIs - -The [Jeff Bezos' API memo](https://gist.github.com/kislayverma/d48b84db1ac5d737715e8319bd4dd368) is one of the most talked-about stories in API programming. - -It is, in my opinion, also one of those practices that succeed in some environments, but not in all. - -## The levels of API accessibility - -An "operation" in your application can be in one of the following levels of "API accessibility": - -* -oo The operation cannot be invoked in isolation easily. -For instance, it is embedded in an MVC controller, mixed with form handling and HTML generation, and thus the best approach to invoke it programmatically is to simulate a browser -* 0 The operation can be invoked, in-process, by calling a function or method, but requiring complex setup or using complex types (e.g. types other than lists, maps, numbers, and strings) -* 1 The operation can be invoked, in-process, by calling a function without complex setup and using plain types -* 2 The operation can be invoked, off-process, by calling a function without complex setup and using plain types -* 3 The operation can be invoked via a command line tool -* 4 The operation can be invoked via a network call - -Many proponents of APIs propose level 4 as the target. -This obviously allows your operations to be integrated into separate processes via network calls, which is the most powerful form of API access. -They will also reason that this will force your application to have a clean architecture with separation of concerns. - -Note also that proper testing will probably push your operations into levels 0-2, as otherwise they are annoyingly complex to test.
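To make the levels concrete, here is a minimal Python sketch (with entirely hypothetical operation and parameter names) of a level-1 operation, a plain function using plain types, exposed at level 3 through a thin command line wrapper:

```python
import argparse

# Level 1: the operation is a plain function with no complex setup,
# taking and returning plain types (numbers, strings, maps).
# (Hypothetical business logic; a real implementation would touch a database.)
def deactivate_customer(customer_id: int, reason: str) -> dict:
    return {"customer_id": customer_id, "status": "deactivated", "reason": reason}

# Level 3: the same operation exposed as a command line tool,
# reusing the level-1 function instead of reimplementing it.
def main(argv=None):
    parser = argparse.ArgumentParser(description="Deactivate a customer")
    parser.add_argument("customer_id", type=int)
    parser.add_argument("--reason", default="unspecified")
    args = parser.parse_args(argv)
    result = deactivate_customer(args.customer_id, args.reason)
    print(result["status"])

if __name__ == "__main__":
    main()
```

The point of the sketch is that the level-3 wrapper adds almost nothing: all the logic stays in the level-1 function, which remains trivial to test in-process.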
- -We propose that the architectural benefits of level 4 are also present in levels 0-3, but achieving these levels requires much less effort than achieving level 4 (where you need to add a network protocol, handle aspects such as authentication/authorization, marshalling/unmarshalling, etc.), so unless you require level 4, you can stay in levels 0-3. -Going to level 3 instead of 0 should be easy when creating new operations, so that is the level of API accessibility we recommend new code adhere to by default. - -Note also that level 3 can provide many benefits of level 4, but with less development overhead, so it is a level we recommend considering explicitly, as it is often overlooked. - -Level -oo is typical of legacy applications. -Note that we consider the distance between level -oo and the other levels to be much bigger than the distances among the other levels. diff --git a/programming/about_relational_databases.md b/programming/about_relational_databases.md deleted file mode 100644 index 464de330..00000000 --- a/programming/about_relational_databases.md +++ /dev/null @@ -1,30 +0,0 @@ -# About relational databases - -## What is a relation? - -A common misconception is that the "relations" in a relational database refer to relationships between database tables. - -Actually, the relations in a relational database are the tables. - -A relation "relates" a set of values to another set of values. - -For example, a relation can relate the name of a person to their birth date and birth place. -For instance: - -``` -(person name) => (birth date, birth place) -(Alice) => (1979-12-03, Barcelona) -(Bob) => (1995-03-04, Paris) -...
-``` - -Many computer languages have similar concepts: - -* [Python mapping types such as `dict`](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict) -* C++ `std::map` -* Java `java.util.Map` -* [C# `System.Collections.Generic.Dictionary`](https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.dictionary-2?view=net-9.0) -* [JavaScript `Object`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object) -* [PHP arrays](https://www.php.net/manual/en/language.types.array.php) - -Relations are a natural concept, so although non-relational data systems exist, most data can be stored as relations. diff --git a/programming/containers_might_not_be_the_right_answer.md b/programming/containers_might_not_be_the_right_answer.md deleted file mode 100644 index b00bbade..00000000 --- a/programming/containers_might_not_be_the_right_answer.md +++ /dev/null @@ -1,114 +0,0 @@ -# Containers might not be the right answer - -Containers are everywhere, and I feel that today they are the default answer to many problems for many people. - -[Although the author of one of the handiest quotes does not want people to use it](https://www.jwz.org/blog/2014/05/so-this-happened/), I think that quote adequately describes the situation. -Definitely, containers are no silver bullet. - -Containers are a good example of an “easy but not simple” technique (see [the "Simple Made Easy" talk by Rich Hickey (2011)](https://www.youtube.com/watch?v=SxdOUGdseq4)). - -Containers are easy because they automate getting arbitrary "isolated" Linux environments and running processes on them. -Additionally, you can find container images for almost everything on the Internet. -For this reason, it is very easy to run a lot of software using containers, often with a single command. - -However, containers are easy, but not simple.
-Containers combine many different techniques to achieve their ease, and thus you frequently hit problems derived from any of the techniques that containers use. - -That said, Docker popularized many good ideas, both specific to containerization and more general ones! -There are still places where containers are the right answer. - -## Reasons not to use containers - -### Containers are Linux - -The fact that containers are a Linux technology is the most common point of complexity. - -One result of this is that using containers on any operating system that is not Linux, such as Windows or macOS, requires a Linux virtual machine. - -This has some accidental problems, like increased memory and CPU usage, and derived inconveniences like decreased battery life and fan noise. - -But the main issue is that adding this VM makes things more complex, mostly because you are adding networking to the mix, and some container features are not available or work worse, like bind mounts. - -Most issues can be worked around, but this requires more effort, or at least, more knowledge to avoid these issues. - -(On top of that, people who develop processes using Linux are not exposed to these issues, so they are likely to introduce issues without realizing that what works on their Linux workstation does not work on macOS or Windows workstations.) - -### Container images are big and expensive - -Optimizing the size of container images can require significant effort. -Popular public images are often optimized for size, but even with optimized images, storing and moving container images frequently requires much more bandwidth and storage than the alternatives. - -There are free ways to host private container images, but they are frequently limited in size, bandwidth, or both. -You can easily run into Docker Hub limits, GitHub only provides 2 GB of storage, etc. - -Building container images can also require significant resources.
- -### Containers require specialized knowledge - -Using containers frequently requires learning quite a few things specific to containers. - -* `Containerfile` design is not obvious. - Some choices, like `ADD` versus `COPY`, or `CMD` versus `ENTRYPOINT`, are subtle and not well documented. - -* Container design is not obvious. - Docker popularized "application containers", a fuzzy concept that is related to "single process containers", the 12-factor architecture, and a few other ideas. - Solving your problem might require good knowledge of application container design and use. - -* Container tools are complex, because containerization is difficult. - You likely need to know some intricate details of how Linux file permissions and users work, for example. - -Not using containers can mean avoiding having to think about these things, and being able to use the time you save to actually solve your problem. - -### Docker is not so good, targeting multiple container engines is not trivial - -Docker was the first popular container engine. -Docker was a revolution and invented or popularized many great ideas. -However, knowledge about containers was not well established when Docker was invented, and since then, better ways of doing many things have been discovered. - -Other tools such as Podman or Fedora Toolbx and adjacent tools such as Distrobox have introduced many improvements with respect to Docker, while still reusing and remaining compatible with many Docker concepts. - -However, creating procedures and tooling that work across these different engines can be difficult, despite their apparent compatibility. - -### In some scenarios, containers do not add much - -Especially after the rise of the Go programming language, distributing binaries has become easier. -Distributing binaries on Windows and macOS has always been simpler than distributing binaries on Linux. - -However, nowadays many programming languages can create binaries that can be downloaded and executed on most Linux distributions.
- -One of the main benefits of Docker has been ease of distribution of software, but nowadays this is easy to achieve through binaries. - -### Beware container images - -Much software is distributed nowadays as container images. -The abundance of container images means that learning how to use containers helps you run a wide variety of software distributed as container images. - -However, many container images are not of great quality, nor adequately updated. - -In some cases, you can find software that has a container image, but where the container image is not of sufficiently good quality and can cause issues down the road. - -## Reasons to use containers - -### Containers still provide isolation easily - -By default, running a container does not significantly alter the system that runs the container. - -This is a huge advantage, and in many cases, more difficult to accomplish without containers. - -### Making things work across different operating systems is not trivial either - -Some decisions in software, like programming language or dependency choices, greatly influence how easy it is to run the software on different operating systems. - -In many cases, making something run in a specific distribution and packaging it as a container can be the most resource-efficient way to distribute software that works across Windows, macOS, and most Linux distributions. - -Finding the right combination that makes software portable can require significant effort, or even be unviable. - -### Some container-related software has good and unique ideas - -For example, the controversial Kubernetes provides a distributed, standardized operating system that can be managed in a declarative way. -This is a powerful concept, and the preferred way to package software for Kubernetes still depends on container images. - -## What to use instead of container images - -* Binaries -* "Portable-friendly" development tools such as Go, `uv`, or Cargo.
diff --git a/programming/crud_is_an_important_unsolved_problem.md b/programming/crud_is_an_important_unsolved_problem.md deleted file mode 100644 index 88497f7e..00000000 --- a/programming/crud_is_an_important_unsolved_problem.md +++ /dev/null @@ -1,92 +0,0 @@ -# CRUD is an important unsolved problem - -The term CRUD (initials of create, read, update, and delete) is used to describe the implementation of applications that provide a simple interface to a database. - -If you have used a computer for work, then you have likely used a system that allows you to manipulate records of customers, orders, products, or any other information related to the business. - -Although programmers have been writing a huge number of CRUD systems for decades, my perception is that the cost of implementing CRUD systems is a major problem with important consequences. - -There are two major approaches to implementing CRUD systems: - -* Traditional programming: combining a relational database and most existing programming languages enables programmers to create CRUD systems. - -* "No code" (or "low code"): many products and services enable non-programmers to describe their data structure and the user interface, requiring less technical knowledge than the traditional programming approach. - -## About implementing CRUD systems with traditional programming - -The Python Django web framework coupled with a relational database requires writing the least code of all the platforms I have used to write CRUD systems. - -Most of what you need to do when using Django is to describe what you need, instead of implementing the mechanical parts of a CRUD system. - -Out of the box, Django provides: - -* List and detail views, including nested views. - Many systems provide "flat" detail views where you can edit a record, but not associated records.
- For example, they provide a detail view for customers where you can edit the customer name and other information, but any "multiple" information, such as multiple addresses or phone numbers, must be edited in a different view. - This is frequently a huge issue, and it can require writing a significant amount of code in other systems. - With Django, you can implement this by describing the associated data. - -* Multi-user authentication and role-based authorization. - With Django and without programming any code, administrators can create groups, assign users to groups, and limit the kinds of records that each group can view or edit. - -* Primitive change tracking. - Out of the box, changes to records are tracked automatically and can be consulted. - -For most CRUD implementations, alternative platforms require significantly more effort to implement those features. - -Additionally, the entire stack is open source software that does not require paying for licenses. - -(Surprisingly, in the past there existed even more sophisticated CRUD platforms. - But sadly, most have disappeared.) - -## About implementing CRUD systems with no code - -A huge number of systems provide similar functionality, in an even more friendly manner. - -They typically provide a user interface where you can create tables, add columns, and describe the user interface without programming. - -Some of those systems offer features comparable or superior to Django's. - -However, because those systems focus on no code usage, you frequently hit roadblocks when using them. - -When you need a feature that they do not provide, it is either impossible, or it requires programming in an unfriendly environment. - -Programming CRUD features can be complex.
-While traditional programming tools have evolved to provide features such as automated testing and advanced revision control (rolling back bad changes, among other capabilities), no code CRUD platforms do not reuse the wealth of tools that have been developed for traditional programming. - -Non-developers frequently face huge challenges going beyond the basics of what the tool provides, and developers struggle and suffer by working in environments that are more limiting than the ones they are used to. - -## The consequences of the high cost of development of CRUD systems - -In these conditions, most CRUD systems are expensive and do not work well. - -Organizations often resort to systems such as spreadsheets, which can be productive, but have severe reliability concerns. - -No code CRUD systems often have significant costs and lock in their customers, because migration costs can be astronomical. - -CRUD systems implemented with traditional programming are often costly to maintain and extend. - -In most cases, organizations cannot justify the costs of tailoring the CRUD system entirely to their needs, so they suffer from using CRUD systems that do not meet their needs. - -## Possible approaches - -### Improving existing traditional programming CRUD platforms - -I believe systems such as Django can still see many improvements. -Likely, both the technical knowledge required to use these systems and the effort required to design CRUD systems can be reduced significantly. - -### Providing systems to transition from no code approaches to traditional programming - -No code approaches are wonderful, because giving end users the ability to describe what they need enables them to experiment and become productive very quickly. - -However, no code platforms cannot provide all the features needed, and in many cases, end users will struggle past a certain point.
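As an illustration of the gap, extending a no code platform usually means going through its HTTP API, when one exists. The following Python sketch uses entirely hypothetical endpoint and field names, since each product defines its own:

```python
import json
import urllib.request

# Hypothetical base URL: no real no code product is being described here;
# this only sketches the general shape of such API integrations.
API_BASE = "https://api.example-nocode.test/v1"

def build_record_payload(table: str, fields: dict) -> dict:
    """Build the JSON body for creating a record in a hypothetical no code table."""
    return {"table": table, "fields": fields}

def create_record(token: str, payload: dict) -> bytes:
    """POST the record to the platform (requires a real endpoint and token)."""
    request = urllib.request.Request(
        f"{API_BASE}/records",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()

payload = build_record_payload("customers", {"name": "Alice", "city": "Barcelona"})
```

Even in this best case, the extension code lives outside the platform, which is one source of the limitations and specific problems that such API-based extensions tend to have.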
- -Providing a way to migrate to a traditional programming approach would enable breaking this barrier and scaling systems more effectively. - -(Some no code platforms have APIs. - With them, programmers can write code to extend the no code CRUD systems using traditional programming approaches. - However, implementing functionality through APIs has limitations and specific problems.) - -## Further reading - -* [About Django](python/about_django.md). diff --git a/programming/git/combining_repos_with_josh_filter.md b/programming/git/combining_repos_with_josh_filter.md deleted file mode 100644 index d63506a3..00000000 --- a/programming/git/combining_repos_with_josh_filter.md +++ /dev/null @@ -1,252 +0,0 @@ -# Combining repos with josh-filter - -## Introduction - -When writing complex software, developers frequently consider incorporating code from other repositories into their repository. -This choice is controversial, and this document will not discuss whether incorporating other repositories is the best option. - -Developers have different options to perform this task, including: - -* Copying the code -* Using Git submodules -* Using Git subtree -* Using repository "orchestration" tools such as Google's repo - -A recent option is Josh. -Josh is a daemon that serves virtual Git repositories, applying transformations to existing repositories. -With Josh, you can create a virtual Git repository that combines multiple repositories, or a virtual repository that contains a subtree of another repository. - -Because Josh is a daemon, using Josh has a greater overhead than other options to combine code from other repositories. -However, Josh provides the `josh-filter` command, which can be used for similar purposes without the extra maintenance of a service. - -This document describes a sample scenario and a step-by-step procedure that you can follow to learn about `josh-filter`. - -## Scenario - -A development team maintains the `foo` repository.
-The code in the `foo` repository uses external code from the `bar` repository. -The team needs to make changes to code in the `bar` repository, but they have not found a convenient procedure to do so. -Ideally, the team would like to synchronize their changes with the `bar` repository, so that they can benefit from `bar` updates, and contribute back. - -## Preparing the example - -Create an `example` directory to contain all the files required for this example. - -To follow this example, you will need two repositories standing for the `foo` and `bar` repositories. -You can use any repository, but the example assumes that the repos are called `foo` and `bar`, and that the `bar` repository will be copied to the `external/bar` path in the `foo` repository. - -The example works with two local mirrored Git repositories, so you can simulate pushing and pulling from a Git provider such as GitHub. -The example uses `foo.git` and `bar.git` as the URLs of the two repositories. -You can replace the URLs with real repository URLs, or you can create these repositories by mirroring real repositories. -If you use local mirrors, then you can also simulate pushing and pulling without affecting real repositories. - -To mirror two repositories locally: - -``` -git clone --mirror $URL_TO_SOME_REPO foo.git -git clone --mirror $URL_TO_SOME_REPO bar.git -``` - -(The example also assumes that you are using branches named `main` in both repositories.) - -You also need the `josh-filter` tool. -Follow [the installation instructions](https://josh-project.github.io/josh/reference/cli.html#josh-filter). - -## Walkthrough - -### Incorporate a repository - -Start by cloning the `foo` repository: - -``` -git clone foo.git -``` - -This command clones the `foo.git` repository to the `foo` directory. - -Change to the `foo` clone. - -``` -cd foo -``` - -Create and switch to an `incorporate-bar` branch. - -``` -git switch -c incorporate-bar -``` - -Fetch the `main` branch of the `bar` repository.
-The `git fetch` command creates a `FETCH_HEAD` reference that contains the `main` branch. -This command is a convenient way to work with multiple repositories. - -``` -git fetch ../bar.git/ main -``` - -Use the `josh-filter` command to incorporate the code in the `FETCH_HEAD` reference into the `external/bar` path. -This command takes a reference as input and rewrites it using a Josh filter. - -The following command takes the `FETCH_HEAD` reference that contains the `bar` code. -The `:prefix=external/bar` filter moves all content of the `FETCH_HEAD` reference to the `external/bar` path. -The result is stored in a new `FILTERED_HEAD` reference. - -``` -josh-filter ':prefix=external/bar' FETCH_HEAD -``` - -Merge the `FILTERED_HEAD` reference into your current branch. -Because the `FILTERED_HEAD` reference contains the `bar` code and is unrelated to the current branch in the `foo` repo, you need the `--allow-unrelated-histories` option. - -``` -git merge --allow-unrelated-histories FILTERED_HEAD -``` - -After this command, `ls external/bar` and `git log external/bar` show the contents and history of the `bar` repository. -`git log` and tools such as `gitk` will show the combined history of the two repositories. - -Push the branch to the `foo.git` remote. - -``` -git push --set-upstream origin incorporate-bar -``` - -If you were working with a real repository, then you could create, review, and merge a pull request by following the usual procedures. -If you are using mirrored repositories, then change to the `main` branch and merge the `incorporate-bar` branch. - -``` -git switch main -git merge --no-ff incorporate-bar -git push -``` - -(`git merge --no-ff` is equivalent to the "create a merge commit" button in GitHub pull requests.) - -At this point, the `main` branch in the `foo.git` repository contains the code from the `main` branch in the `bar.git` repository in the `external/bar` path.
-The code has the full history, and changes can be contributed to and from the `bar.git` repository. - -### Incorporating upstream changes from the `bar.git` repository - -If new changes are pushed to the `bar.git` repository, then you can pull those changes into the copy in the `foo.git` repository. - -#### Simulating changes in the `bar.git` repository - -Change to the `example` directory. - -Clone the `bar.git` repository. - -``` -git clone bar.git -``` - -Change to the `bar` directory and make some changes. - -Push the changes to `bar.git`. - -``` -git push -``` - -#### Incorporate the changes - -Change to the `example/foo` directory. - -Create and switch to a `pull-bar` branch. - -``` -git switch -c pull-bar -``` - -Fetch the changes again. - -``` -git fetch ../bar.git/ main -``` - -If you run the `git log FETCH_HEAD` command, then you can verify that the changes you made are in `FETCH_HEAD`. - -Filter, merge, and push the changes again. - -``` -josh-filter ':prefix=external/bar' FETCH_HEAD -git merge --allow-unrelated-histories FILTERED_HEAD -git push --set-upstream origin pull-bar -``` - -After these commands, the `pull-bar` branch contains the new changes from `bar.git`. - -If you were working with a real repository, then you could create, review, and merge a pull request by following the usual procedures. -If you are using mirrored repositories, then change to the `main` branch and merge the `pull-bar` branch. - -``` -git switch main -git merge --no-ff pull-bar -git push -``` - -### Upstreaming changes in `foo.git` to `bar.git` - -If you make changes to the `external/bar` directory in the `foo.git` repository, you can contribute these changes back to the `bar.git` repository. - -#### Simulating changes in the `foo.git` repository - -Change to the `example` directory and to the `foo` directory. - -Make some changes to the `external/bar` directory. - -Push the changes to the `foo.git` repository.
- -``` -git push -``` - -#### Upstreaming the changes - -Change to the `example` directory. - -Change to the `bar` directory. - -``` -cd bar -``` - -Create and switch to an `upstream-from-foo` branch. - -``` -git switch -c upstream-from-foo -``` - -Fetch the `main` branch from the `foo.git` repository. - -``` -git fetch ../foo.git/ main -``` - -Apply the opposite filter. -The `:/external/bar` filter puts the contents of the `external/bar` directory in the root of the repository. -Then merge the result as before. - -``` -josh-filter ':/external/bar' FETCH_HEAD -git merge --allow-unrelated-histories FILTERED_HEAD -``` - -After these commands, the `upstream-from-foo` branch in the `bar.git` repository contains the upstream changes from `foo.git`. - -If you were working with a real repository, then you could create, review, and merge a pull request by following the usual procedures. -If you are using mirrored repositories, then change to the `main` branch and merge the `upstream-from-foo` branch. - -``` -git switch main -git merge --no-ff upstream-from-foo -git push -``` - -## Further possibilities - -This walkthrough explains how to incorporate code from a repository into a different repository. -Then, you can synchronize further changes to both repositories. - -With similar steps, you can experiment with how other Git functionality is affected, such as resolving merge conflicts, or different Git workflows. diff --git a/programming/git/git_advice.md b/programming/git/git_advice.md deleted file mode 100644 index 7dd19818..00000000 --- a/programming/git/git_advice.md +++ /dev/null @@ -1,22 +0,0 @@ -# Git advice - -## Never use `git commit -m`, use `git commit -v` - -Configure your system so that `git commit` opens your preferred editor. - -With `git commit -v` you can see your commit diff while writing your commit message. -This helps you review that your commit is correct and write a better commit message. - -## Use gitignore properly - -See <https://git-scm.com/docs/gitignore>.
- -Note that, in addition to per-repository `.gitignore` files, Git reads a global ignore file, which defaults to `$XDG_CONFIG_HOME/git/ignore` or, if that variable is not set, `$HOME/.config/git/ignore`. - -## Use the modern Git commands (or teach them) - -In particular, `git checkout` combines many functions that can now be handled by more focused commands like `git switch` and `git restore`. - -If you have too much muscle memory with the older commands, then consider learning the newer ones at least to teach other people, so that they start with the safer commands. - -Many Git commands print suggestions that use the newer commands. diff --git a/programming/git/github_annoyances.md b/programming/git/github_annoyances.md deleted file mode 100644 index df373bd7..00000000 --- a/programming/git/github_annoyances.md +++ /dev/null @@ -1,9 +0,0 @@ -# GitHub annoyances - -## The repository creation wizard can be confusing initially - -When creating a new repo, GitHub offers to populate the repository with some files (a README, a `.gitignore` file, a license). - -In some situations, you have an existing directory on your computer with files that you want to be the initial contents of the repo. -If you create a truly empty repo, then GitHub displays some instructions that can help you push the contents of your existing directory to the new repo. -If you use the GitHub features to populate the repo, then GitHub does not display these instructions, and uploading your files requires more knowledge. diff --git a/programming/java/tutorial.md b/programming/java/tutorial.md deleted file mode 100644 index e4f5f1d1..00000000 --- a/programming/java/tutorial.md +++ /dev/null @@ -1,84 +0,0 @@ -# A Java tutorial - -This tutorial walks through creating a blank Spring Boot application on Pop!_OS 22.04. - -## Set up - -Open a terminal and run: - -``` -$ sudo apt install openjdk-21-jdk code -``` - -and follow the prompts. - -(You can install both packages from the Pop!_Shop application.
-However, when installing Visual Studio Code, Pop!_Shop defaults to the Flatpak version of the program, which is more troublesome than the .deb package. -Choosing the correct software is easier from the terminal.) - -Open Visual Studio Code from "Show Applications". - -Follow the "Walkthrough: Setup VS Code", but in the "Rich support for all your languages" step, click "Browse Language Extensions" and install "Extension Pack for Java". -Do not install a new JDK; you installed one in a previous step. - -## Creating a Spring Boot application - -In the command palette (Ctrl+Shift+P), search for and execute "Java: Create Java Project...". - -Select "Spring Boot". - -Install the Initializr plugin if prompted; you might need to restart the Java project creation wizard. - -Select "Maven Project", then select the highest Spring Boot version that is not a snapshot. - -Select the Java language. - -Enter a group id. A group id should be a domain name in reverse order. The default of "com.example" is OK. - -Enter an artifact id. This should be a single keyword. The default of "demo" is OK. - -Select the jar packaging type. - -Select the Java 21 version matching the JDK you installed in a previous step. - -Add only the Spring Web dependency. - -When choosing the folder for the project, Visual Studio Code creates a further folder named like the artifact id you entered in a previous step. -(So do not create a directory with your application name.) - -Choose File, Open Folder in the Visual Studio Code menu, then select the directory that the previous step created, named like the artifact id. - -"Trust the authors". - -Navigate to the `src/main/java/com/example/demo/DemoApplication.java` file (the path varies depending on the group id and artifact id). - -Right-click on the file and select "Run Java". -If the "Run Java" option is not present, then you might not have "trusted the authors"; in this case, Open Folder again.
-After a few moments, the terminal displays a message about the application having started.
-
-Open a browser and navigate to <http://localhost:8080>.
-The browser displays an error because the application wizard creates an empty application.
-
-## Advice for people involved in developing the software used in this tutorial
-
-### The Pop!_OS Eddy tool should allow interacting with debconf
-
-When using the Visual Studio Code .deb from Microsoft and not the .deb from Pop!_OS, installing the package from the browser fails.
-The Visual Studio Code .deb has a debconf prompt to add Microsoft package repositories.
-This locks up Eddy.
-
-This is likely covered by these issues:
-
-* https://github.com/donadigo/eddy/issues/105
-* https://github.com/donadigo/eddy/issues/107
-
-### The Pop!_Shop should not default to the Visual Studio Code Flatpak
-
-Flatpaks are great for most applications, but not for development tools.
-Using the Visual Studio Code Flatpak makes configuring development tools harder.
-
-### Visual Studio Code should not prompt users to download a .tar.gz archive of Java without further instructions
-
-On Linux distributions where the distribution package manager provides a reasonably recent version of Java, users can install Java through the package manager more easily.
diff --git a/programming/mama_quiero_ser_programador.md b/programming/mama_quiero_ser_programador.md
deleted file mode 100644
index 7fc8ded3..00000000
--- a/programming/mama_quiero_ser_programador.md
+++ /dev/null
@@ -1,137 +0,0 @@
-# Mom, I want to be a programmer
-
-Our first computer arrived home when I was four years old.
-My mother likes to repeat that on that day, my father, my brother, and I skipped lunch.
-Computers have fascinated me ever since, which led me to a fascination with programming.
-
-Curiously, I studied computer engineering somewhat by chance, and even when I finished my degree, I had doubts about whether it would be my professional career.
-
-However, lately there has been a lot of interest in the benefits of working as a programmer.
-
-More recently, the LLM gold rush has sown doubts about the future of the profession.
-
-This text tries to collect my opinions on these topics.
-
-## Observations on the job market
-
-Programming jobs seem to reflect that there is a lot of work and few people qualified to do it.
-There are quite a few comparatively well-paid jobs with good conditions.
-
-However, this mostly applies to workers with substantial experience.
-People with little experience report that finding a job requires a disproportionate effort, if they manage to find a job at all.
-
-I propose that this is because most openings are for experienced programmers, and the attractiveness of the profession has generated a number of candidates much larger than the available openings, creating the inverse situation to that of experienced programmers.
-
-Additionally, the programming profession has the peculiarity that many professionals dedicate a lot of their free time to practicing it, beyond work or studies.
-Since the process of getting a job involves competing with the other candidates for a position, self-training to improve one's chances has become popular among programmers with little experience.
-This measure is only effective when it lets us stand out from other competitors, so it seems that more and more self-training effort is needed to compete.
-
-### The irruption of LLMs and the post-pandemic crisis
-
-In 2020, several factors generated higher-than-usual growth in the sector.
-However, starting in mid-2022, layoffs in the sector skyrocketed.
-<https://layoffs.fyi> collects layoff figures that, since the second quarter of 2022, have consistently remained well above the 2020-2021 period, with a layoff peak in the third quarter of 2023.
-
-Additionally, in November 2022, OpenAI launched ChatGPT.
-Since then, many have predicted that LLMs may significantly affect the job market in general and this sector in particular.
-
-Finally, many interpret political and economic movements and other instabilities as another looming global crisis.
-
-## The uncertainty about the sector
-
-Those looking for a professional career and considering programming should ask themselves whether the programming sector is still as attractive as it seemed a few years ago.
-
-The answer is uncertain right now.
-
-The problems of entering the market remain the same or worse than they have been in recent years.
-
-Additionally, now even experienced professionals have doubts about their future.
-
-There are no certainties to predict the future, but we can observe the past.
-
-Betting on programming looks like a good idea in hindsight, but many people have left the sector, and not everyone has had good, well-paid jobs.
-
-[Face it: you're a crazy person](https://www.experimental-history.com/p/face-it-youre-a-crazy-person) is an article that proposes that choosing a profession should be based on how attractive we find *all* the parts of the job, especially the worst ones.
-
-Already back when I was studying, many people imagined themselves having fun programming video games.
-
-In my opinion, some of the worst parts of programming are the rushes; everything is always done with less time than we would like.
-Because of this, what we make and what we use tends to be poorly documented or to not work well, making programming less about "building things that are useful" and more about "patching slightly broken things with a thousand kludges".
-
-Additionally, we will very likely have to dedicate unpaid time, outside our formal education, to training ourselves, in general doing things that, while possibly more gratifying, will in general also be frustrating.
-(Additionally, when it comes to getting a job, unfortunately it will in general also help enormously to *complete* things that we can put on our résumé.)
-
-On another level, the programming jobs that might be the most motivating and edifying are generally the worst paid and have the worst conditions, while the good ones are usually those that will awaken the least vocation in anyone.
-In my opinion, it is difficult to achieve some fulfillment in this sector without sacrificing most of the benefits that many see in the profession.
-
-In the short term, my forecast is that all of this will get worse.
-The sector will very likely remain a much better option than most, but I believe that job expectations will have to be lowered.
-The only advice I can think of is to try to build things similar to those we see in the real world, to find out whether we really like the work.
-
-## Getting a job
-
-Hiring processes are a proportionally very small part of working life, but they concentrate a large part of what is talked and complained about in this sector.
-
-I have more than two decades of professional experience, I believe I have had very good jobs, and I generally go for positions with less competition than usual.
-But to find a job, I have at times had to apply to more than a hundred openings and collect countless rejections of every kind, silent and loud.
-
-There are studies that seem to show that a very significant share of openings in the sector are even completely fictitious.
-(This surely affects other sectors too, but it seems especially popular in this one.)
-
-Personnel selection processes have a large competitive component because, in general, there are always other candidates trying as hard as we are to be the ones chosen.
-
-### Sources of job openings
-
-Although I believe that the large job platforms are less effective than other ways of finding work, it is worth examining their openings to learn what the market demands, and to apply to every opening we can along the way.
-The latter might even get us into some process and perhaps land us a job, but it is also important because selection processes require real practice to improve our chances.
-
-It is important to remember that, very frequently, what look like requirements in these job openings are not.
-If a company asks for more knowledge in an opening than is reasonable, it is very likely that they will not find anyone who meets all the requirements and that they will hire someone who does not meet them all.
-
-In general, the best place to find better vacancies is small communities:
-
-* "Meetups" are small, generally periodic events where short talks are given.
-* Many cities have their programmer communities, which usually have an online channel (and do not usually exclude people from elsewhere).
-* Likewise, many technologies also have their own communities, although there are fewer specifically Spanish ones.
-  (In general, getting jobs abroad is considerably more complicated, so I recommend focusing on Spanish communities.)
-
-Many of these communities have job posting boards.
-Many of these postings are placed by community members and not by companies, so they are more likely to be real, and in many cases, we can talk directly with the person posting the ad.
-Additionally, in many cases community job boards have stricter rules about publishing salary ranges and about clarity of conditions (such as the real remote work arrangement).
-
-The volume is of course much lower, but it is well worth finding as many job boards of this kind as possible and focusing more on their openings.
-(Although we will rarely find enough openings there to land a job, so we will always have to fall back on the large platforms.)
-
-### The résumé and online presence
-
-This document will not cover the résumé, as there is already plenty of material on the subject and it is not my specialty either.
-
-However, it is necessary to point out that a lot of the people we will compete with for positions will have more material than we expect on their résumés; if not jobs, then personal projects, participation in open source projects, and the like.
-
-Therefore, unfortunately, dedicating our free time to growing our résumé may be necessary.
-
-### Selection processes
-
-Selection processes try to find the best option among the candidates.
-
-A selection process can never adequately evaluate a candidate's ability to do their job, so they are always based on approximations, generally not very good ones, that generally have little to do with the work that will actually be done.
-
-Since this is not very exact, selection processes involve a lot of imitation and fashion.
-
-This has an advantage: at any given moment in time there are about half a dozen interview types.
-Additionally, for each fashionable type of test, there are plenty of materials to prepare for it.
-
-In my opinion, we have to accept the absurdities of selection processes, dedicate a significant amount of time to preparing for the few formats most popular at the moment, and perhaps console ourselves with the fact that the number of popular formats is not much larger.
-
-### Choosing jobs
-
-[The McNamara fallacy says that, when making decisions, we give more importance to what is easy to measure with a number than to what is not](https://es.wikipedia.org/wiki/Falacia_de_McNamara).
-
-The salary is one of the few variables we can know when we have an offer on the table, and it is not such a bad metric.
-
-But my recommendation is that, besides the salary, we try to evaluate "how much will this job help make my next job better?"
-
-It is not as if there are many ways to evaluate this with a minimum of certainty, but:
-
-* Framing what we observe in terms of this metric can help us focus the evaluation.
-* In hindsight, we can evaluate how much a job has helped us on this metric and try to extrapolate signals that help us predict it.
diff --git a/programming/on-llms.md b/programming/on-llms.md
deleted file mode 100644
index 744c2a38..00000000
--- a/programming/on-llms.md
+++ /dev/null
@@ -1,123 +0,0 @@
-# On LLMs
-
-This document is mostly an attempt to sort my thoughts on LLMs.
-My intention is to flesh it out over time.
-
-I recently saw a survey about LLMs that had four options: they are great, they are hype, they are harmful, they are fake.
-My thought was: it's all of the above.
-
-## LLMs are awesome
-
-I am absolutely amazed at LLMs.
-By crunching an absurd amount of data, you get a magical box that can continue any dialogue, including dialogue that performs translations, writes code, and much more.
-Although I suspect it's less technologically impressive, you can ask an LLM to draw a picture of anything and the LLM produces it!
-
-The components of LLMs are also magical.
-When machine learning became a trend, many talented people found ways to turn things into vectors, which is amazing.
-With LLMs, this trend continues.
-Word embeddings can represent language as vectors, and we can operate with those vectors with semantic results.
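A toy sketch of what operating on word vectors with semantic results can mean: the classic "king - man + woman is close to queen" analogy, using tiny hand-made vectors rather than real embeddings (all numbers here are invented for illustration):

```python
import math

# Tiny hand-made "embeddings"; real models use hundreds of learned dimensions.
# The dimensions here loosely encode (royalty, maleness, femaleness).
embeddings = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Compute king - man + woman, component by component...
target = [k - m + w for k, m, w in zip(embeddings["king"],
                                       embeddings["man"],
                                       embeddings["woman"])]

# ...and find which remaining word is most similar to the result.
closest = max(["queen", "man", "woman"],
              key=lambda word: cosine(embeddings[word], target))
print(closest)  # queen
```

With real embeddings learned from data, the same arithmetic surfaces analogies like this one, which is part of what makes them feel like a breakthrough.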
-This is an absolute breakthrough, with many profound implications.
-
-### LLMs seem great for accessibility
-
-Although I will elaborate later on my skepticism about some LLM results, I think that there are already valid applications of LLMs.
-
-Mostly, in accessibility.
-LLMs are effective at describing images and videos, can do text-to-speech and speech-to-text, and more.
-Accessibility is a very important field, and even if I have big objections to LLM use, I think LLMs are likely a net positive for accessibility.
-Any criticism of LLMs must take such applications into consideration, and if we want to avoid LLM use in these fields, we *must* provide equivalent alternatives.
-(Some of my criticism below is an ethical objection to LLM companies using content against its authors' wishes, but I think few authors would really object to not-for-profit accessibility uses of LLMs.)
-
-## LLMs are harmful
-
-### LLMs seem to require blatant disregard of intellectual property to be viable
-
-Napster (1999) was not the first high-profile case of massive copyright infringement, but since then, we have frequently seen large companies try to punish individuals to protect their intellectual property.
-
-With LLMs, courts have found that companies training LLMs have used pirated material on a massive scale.
-
-I can see how for most people this is hugely unfair.
-
-Intellectual property is a complex topic, but morally it does not seem defensible to me that someone can be in trouble for pirating a TV show while a large company can try to make a lot of money through massive piracy.
-
-(As I mentioned, if this is not done for profit, to me this whole problem disappears.)
-
-I have mixed feelings with regard to copyright law, but ultimately I think that authors deserve to have rights over the use of their work.
-For example, I think it is fair that a song composer might forbid an organization such as a political party from using their songs.
-Art ultimately benefits us, and I think some degree of copyright protection helps incentivize creators.
-
-Therefore, given that LLM companies have widely admitted to using copyrighted material without permission, for-profit LLM companies must demonstrate that they have the right to use materials for training, and authors should have an easy way to prevent their works from being used, and to receive fair compensation if they want.
-
-My position does not hinge on how transformative the LLM output is.
-It hinges on what authors want from their work; I am pretty certain that most authors nowadays are OK with other authors taking inspiration from their work (because they were also certainly inspired by other authors), but they are not OK with a large company reselling their work without compensation, or with their work being used to replace authors with a machine.
-
-For other areas, such as writing programs using LLMs, I have similar objections.
-
-LLMs are effective as long as they have sufficient training material.
-In my opinion, if you think the LLM would not be effective if all content from authors who would object to your specific use of an LLM was removed, then you should not use the LLM for that specific purpose.
-
-If you are using an LLM to create an illustration instead of commissioning one, I think you should not do it, because I don't think the LLM would be effective if all authors who objected to this use could remove their work from the training set.
-
-(As mentioned, I think very few authors would object to an LLM describing images to a blind person without making an obscene profit.)
-
-#### LLMs might need to be forced to publish all their output
-
-I read somewhere that LLM output should not be copyrightable.
-
-Making all LLM output public would solve a few issues:
-
-* Generated LLM content becomes easy to identify by searching the published LLM output.
-* IP laundering would result in public material that can be easily replicated, making commercial benefit difficult.
-
-## LLMs are hype
-
-### LLMs might not be effective for writing code
-
-Hillel Wayne's [What we know we don't know](https://www.hillelwayne.com/talks/ese/ddd/) says that there are very few programming practices that are provably effective in increasing programming productivity.
-
-Apparently, the most effective practice is code review, and its effectiveness is small compared to getting proper sleep, not being stressed, and working the right number of hours.
-
-Many other practices, such as good variable naming, or static or dynamic typing, do not seem to have a big effect, although most of us believe that some specific practices make us much more productive.
-
-Many people claim that LLMs make them wildly more productive, and I even believe many of them truly believe so.
-
-However, whenever I pull the thread on such claims, I tend to find things that make me skeptical.
-
-I do not rule out that LLMs may increase productivity *today* for *some* tasks.
-But I suspect they might *also* decrease productivity sometimes.
-This lack of certainty about their effectiveness for coding, *combined* with the rest of the problems I see with the use of LLMs, drives me towards rejecting the use of LLMs for code.
-They might help, but they might also hinder us, and I think it's better to be conservative with regard to their use.
-
-### LLMs cannot be good oracles
-
-My understanding of LLMs is that they are good at producing "things that look like X".
-
-If you ask them to create a picture of X, then they will produce something that looks like a picture of X.
-This is likely what you want, so I think LLMs can be reasonably effective at this task.
-
-If you ask them to answer X, then they will produce some text that looks like the answer to X.
-However, something that reads like the answer to a question is likely *not* an answer to the question.
-
-Although it is surprising how often the LLM will produce a correct answer, I think ultimately LLMs are not a good way to answer questions, because when they answer correctly, they do so accidentally.
-
-## LLMs point to serious problems
-
-I see many programmers using LLMs to review their code or brainstorm.
-
-Although LLMs might be somewhat effective in those tasks, I am surprised that people do not rely on other people for this.
-
-To me, this points to people not having other people to collaborate with, or preferring to collaborate with a piece of software rather than someone else.
-
-Although this might not be unequivocally harmful, it worries me, and I think we should spend time and resources looking at this phenomenon.
-
-Personally, I think we should offer ourselves to others more, probably in a tit-for-tat fashion.
-
-## Other sources
-
-* [The Future of Software Development is Software Developers](https://codemanship.wordpress.com/2025/11/25/the-future-of-software-development-is-software-developers/)
-  > “But this time it’s different, Jason!”
-  > [...]
-  > And there’s another important distinction: in previous cycles, the technology worked reliably.
-  > We really could produce working software faster with VB or with Microsoft Access.
diff --git a/programming/prolog_vs_sql.md b/programming/prolog_vs_sql.md
deleted file mode 100644
index beb92c01..00000000
--- a/programming/prolog_vs_sql.md
+++ /dev/null
@@ -1,120 +0,0 @@
-# Showing the similarities between SQL and Prolog
-
-SQL is a very common programming language that is sometimes compared to the relatively obscure Prolog language.
-Both are examples of declarative languages: you define some facts, then you can ask questions about those facts, and the system answers the questions without you writing an explicit program.
-
-However, I could not find a good example of their similarities.
-This text presents the most typical Prolog example, and translates it to SQL.
-
-## A typical Prolog example
-
-`[x]` reads Prolog facts from file `x`.
-We can use the special file `user` to read facts from the REPL, ending the facts with Ctrl+D:
-
-```
-$ swipl
-?- [user].
-|: father(jim, julian).
-|: father(julian, joe).
-|: father(julian, jerome).
-|: father(pete, perry).
-|: ^D
-true.
-```
-
-You should read `father(X,Y)` as "`X` is the father of `Y`".
-So Jim is the father of Julian, and so on.
-
-We can ask Prolog questions:
-
-```
-?- father(julian, jim).
-false.
-```
-
-Is Julian the father of Jim? There is no known fact about this, so no.
-But Julian *is* the father of Joe:
-
-```
-?- father(julian, joe).
-true.
-```
-
-More interestingly, you can ask who Julian's children are:
-
-```
-?- father(julian, X).
-X = joe ;
-X = jerome.
-```
-
-(You press `;` to get further answers.)
-
-## A simple translation to SQL
-
-You can do pretty much the same with SQL.
-First, define the facts as rows in a table:
-
-```
-$ sqlite3
-sqlite> create table fatherhood(father, son);
-sqlite> insert into fatherhood values ('jim', 'julian');
-sqlite> insert into fatherhood values ('julian', 'joe');
-sqlite> insert into fatherhood values ('julian', 'jerome');
-sqlite> insert into fatherhood values ('pete', 'perry');
-```
-
-Then you can get the same answers:
-
-```
-sqlite> select * from fatherhood where father = 'julian' and son = 'jim';
-sqlite> select * from fatherhood where father = 'julian' and son = 'joe';
-julian|joe
-sqlite> select * from fatherhood where father = 'julian';
-julian|joe
-julian|jerome
-```
-
-## The next step in Prolog
-
-The typical example continues with some logic:
-
-```
-?- [user].
-|: grandfather(X,Y) :- father(X, Z), father(Z, Y).
-|: ^D
-true.
-```
-
-`X` is the grandfather of `Y` if `X` is the father of `Z`, and `Z` is the father of `Y`.
-Then you can ask questions, and Prolog knows the answers:
-
-```
-?- grandfather(jim, X).
-X = joe ;
-X = jerome.
-
-?- grandfather(X, jerome).
-X = jim ;
-false.
-```
-
-## Can we do the same in SQL?
-
-You might not guess it on the first try, but the answer is not complex: you can do the same thing with an SQL view:
-
-```
-sqlite> create view grandfatherhood as
-   ...> select fatherhood_1.father as grandfather, fatherhood_2.son as grandson
-   ...> from fatherhood as fatherhood_1 join fatherhood as fatherhood_2 on (fatherhood_1.son = fatherhood_2.father);
-```
-
-And if you ask the same questions, SQLite gives the same answers:
-
-```
-sqlite> select * from grandfatherhood where grandfather = 'jim';
-jim|jerome
-jim|joe
-sqlite> select * from grandfatherhood where grandson = 'jerome';
-jim|jerome
-```
diff --git a/programming/python/about_django.md b/programming/python/about_django.md
deleted file mode 100644
index 37004abc..00000000
--- a/programming/python/about_django.md
+++ /dev/null
@@ -1,141 +0,0 @@
-# About Django
-
-Without more context, one of the technologies I recommend to everyone is Django.
-
-Django is a Python web framework with "batteries included".
-
-Web frameworks can provide more or fewer tools to write applications.
-Typically, frameworks that provide fewer tools are more flexible and give developers more freedom to develop their applications in the best possible way.
-Similarly, frameworks that provide more tools tend to guide you towards a specific way of writing applications, and typically require more work if you want to deviate.
-
-In my opinion, many applications you might need to develop are very similar and have similar issues, and solving them ad hoc for each project is a waste.
-Therefore, I lean towards using frameworks that provide more batteries in most cases.
-
-(Certainly, there are projects that clearly need special approaches, or that deviate enough from what any generic web framework supports.)
-
-In fact, most of the complaints described in this document are caused by Django having too few batteries, not too many!
-
-> [!TIP]
-> [Django training wheels](https://github.com/alexpdp7/django-tws) is my alpha-stage project to address some of those shortcomings.
-
-## The Django admin
-
-Besides including more batteries than most other frameworks, and being in general a well-engineered framework in my opinion, Django includes the admin.
-
-The admin is a declarative way to build administrative sites where some users edit data stored in the application database.
-
-Many similar tools exist, but I have not found any other tool that can do so much.
-
-* The Django admin handles multi-table relationships very well, including picking foreign key targets and editing related table data.
-  For example, if a person entity has a "parent" foreign key relationship, the Django admin provides a search functionality to pick a person's parent.
-  If the person entity has a list of children, the Django admin provides a way to add and edit children from the person form.
-
-* The Django admin has a simple permissions functionality, useful in many scenarios, where editing certain entities can be restricted to groups of users.
-
-The Django admin is frequently a big boost during the early development of database-backed applications, and sometimes it can provide value during a big part of the life of an application.
-
-Additionally, when working with frameworks without an equivalent facility, the friction of adding an interface to edit a piece of data can be large.
-Developers pressed for time might opt to hardcode the data in the source code of the application, requiring code changes to modify certain behaviors of the application.
-When the friction to add a user interface to edit such data is low, developers can configure the admin to let users edit the data directly without going through the developers.
-
-## Django problems
-
-However, there are still many common issues for which batteries could exist, but that Django does not provide.
-
-### Django has no support or documentation for packaging Django projects
-
-Most Django projects have dependencies besides Django.
-In order to develop and deploy Django applications, you likely must install other dependencies.
-Django does not include documentation or support for doing this.
-
-Many different approaches and tools exist to manage Python project dependencies.
-Understandably, endorsing one particular approach in Django could be controversial.
-So Django leaves the choice of approach up to users.
-Additionally, Django adds a few difficulties to Python project management, and users must figure out how to handle Django projects in their chosen approach.
-
-Several initiatives have tried to tackle this problem, notably:
-
-* https://github.com/radiac/nanodjango
-
-### Django settings are a partial solution
-
-Django provides settings to manage the configuration for a Django project.
-You implement Django settings by writing a Python module.
-
-For example, the default Django template includes the following snippet to configure the database connection:
-
-```
-DATABASES = {
-    'default': {
-        'ENGINE': 'django.db.backends.sqlite3',
-        'NAME': BASE_DIR / 'db.sqlite3',
-    }
-}
-```
-
-Besides assigning a setting directly like in the preceding snippet, you can use Python code to assign settings.
-
-This allows you to tackle many common issues, such as setting up a different database connection for development and production, while keeping the production database credentials away from the source code repository.
-There are many similar issues that you must tackle in nearly all projects.
-
-Several initiatives tackle some of those issues:
-
-* https://github.com/jazzband/dj-database-url provides a way to configure the database connection through an environment variable.
-
-### Django does not explain a development database workflow
-
-Django provides migrations to handle schema changes.
-Migrations work well and are a valid solution for handling schema changes in production.
-
-However, while developing a Django application, you frequently need to make many temporary changes to the data definition until you find the right one.
-
-In my opinion, if you follow the Django documentation, then you might end up using migrations for those development schema changes.
-This is awkward and problematic, and there are procedures for developing database changes that work better.
-
-I would like a command that recreates your database, applying unmigrated model changes.
-This command could also have hooks to load sample data.
-(Likely Python code, not fixtures.)
-
-### Django only tackles database-based, server-side-rendered, non-highly-interactive web applications
-
-While it is certain that a huge number of applications:
-
-* Revolve around data stored in a relational database
-* Are better implemented as server-side rendering applications
-* Do not require very complex or real-time interactions
-
-there are certainly many applications that do not fit this mold.
-
-In my opinion, focusing on database-based applications is a good decision.
-Many Django features (like the admin) revolve around the database, and a framework oriented towards other applications likely should be very different.
-
-However, more and more applications break the limits of server-side rendering, and while you can build such applications with Django, you need a lot of effort or must find additional libraries to use.
-
-For example:
-
-* [Django REST framework](https://www.django-rest-framework.org/) provides a layer that exposes REST APIs on top of the Django ORM.
-* Projects exist to add Django support for front end frameworks such as [htmx](https://htmx.org/) or [Hotwire](https://hotwired.dev/).
-  These frameworks are an intermediate step between traditional server-side-rendered applications and JavaScript front ends, enabling most of the benefits of JavaScript front ends within the traditional server-side rendering approach.
-
-Additionally, providing an API is also useful beyond JavaScript front ends.
-APIs are necessary for other purposes, such as implementing mobile apps that interact with your application, or just providing programmatic access to your application.
-
-### Some common tasks should have more tutorial content
-
-The Django documentation is mostly for reference, covering all Django features, but with little content on how to use Django.
-The items I list below are likely documented in books, websites, forums, etc.
-If you know a good source for many of these, even if it is paid, feel free to let me know so I can add references.
-
-* Admin
-  * Restricting users to a subset of the instances of a model.
-    For example, users belong to organizations and users should only see instances of some model related to their organization.
-    The FAQ contains [How do I limit admin access so that objects can only be edited by the users who created them?](https://docs.djangoproject.com/en/5.1/faq/admin/#how-do-i-limit-admin-access-so-that-objects-can-only-be-edited-by-the-users-who-created-them), which is a very similar question and points to the features you need to use to achieve these goals.
-    (These requirements are often related to [extending the existing User model](https://docs.djangoproject.com/en/5.1/topics/auth/customizing/#extending-the-existing-user-model).)
-  * Having a search UI for reference fields instead of dropdowns.
-    Many projects similar to the admin only offer dropdowns for reference fields.
-    This does not work when the referenced objects number more than a handful.
-    Django calls this [`raw_id_fields`](https://docs.djangoproject.com/en/5.1/ref/contrib/admin/#django.contrib.admin.ModelAdmin.raw_id_fields), and it is difficult to learn that this feature exists.
-
-## Further reading
-
-* [CRUD is an important unsolved problem](../crud_is_an_important_unsolved_problem.md)
diff --git a/programming/python/creating_nice_python_cli_tools.md b/programming/python/creating_nice_python_cli_tools.md
deleted file mode 100644
index b192da1a..00000000
--- a/programming/python/creating_nice_python_cli_tools.md
+++ /dev/null
@@ -1,40 +0,0 @@
-Following this advice can make your tools easy to install by others, pleasant to use, robust, cross-platform, and powerful.
-
-* Use [my suggestions for setting up Python projects](project_setup.md), particularly:
-  * Provide instructions for installing your tool using [pipx](https://github.com/pypa/pipx).
-    Using pipx, people can install and upgrade your script using a simple command that requires no administrative privileges (but it requires having Python and pipx installed).
-  * As you are using [uv](https://docs.astral.sh/uv/), following the indications above:
-    * Use [entry points](https://docs.astral.sh/uv/concepts/projects/config/#entry-points), so when installing your tool via pipx or other means, your scripts are added to the user's path.
-    * Dependencies you define will be installed automatically along with your application.
-      This reduces the effort users need to run your application when you use third-party libraries.
-      However, I would still advise avoiding unnecessary dependencies (for simple HTTP requests you can use the standard library; for complex requests, a third-party library might be much simpler).
-      As you are using pipx, those dependencies will be installed to an isolated virtualenv, so they will not interfere with anything on your system.
-  * As your application is properly packaged, you can split your code into different Python files and use imports without issues.
-* If your application requires secrets, such as credentials, consider using:
-  * The standard [getpass](https://docs.python.org/3/library/getpass.html) module.
-    This prompts for a string on the command line, hiding what the user types.
-  * The [keyring](https://pypi.org/project/keyring/) library.
-    This stores secrets using your operating system facilities.
-* Use the [appdirs](https://pypi.org/project/appdirs/) library to obtain "user paths", such as the user's directories for configuration, cache, or data.
-  appdirs knows the proper paths for Linux, macOS and Windows.
-  So for example, if your tool caches files and uses appdirs to find the cache directory, you might gain benefits such as cache files being excluded from backups.
-* If your tool requires significant time to complete a process:
-  * Use the [tqdm](https://tqdm.github.io/) library to add a progress bar.
-  * But also consider using the standard [concurrent.futures](https://docs.python.org/3/library/concurrent.futures.html) module to add parallelism if you can.
-    The [map](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor.map) function is particularly easy to use.
-    Use it with a [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) if the parallel tasks are IO-bound or invoke other programs, or with [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) if they perform significant CPU work in Python (to avoid the [GIL](https://wiki.python.org/moin/GlobalInterpreterLock)).
-  * Consider using the standard [logging](https://docs.python.org/3/library/logging.html) module with a format that includes a timestamp, so users can inspect how much time is spent in different parts of the program.
-    You can also use the logging module to implement flags such as `--debug` and `--verbose`.
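The parallelism advice above can be sketched as follows; the `fetch` function and the URLs are made-up stand-ins for real IO-bound work such as HTTP requests:

```python
import concurrent.futures
import time

def fetch(url):
    # Stand-in for an IO-bound task, such as an HTTP request.
    time.sleep(0.01)
    return f"fetched {url}"

urls = [f"https://example.com/{n}" for n in range(5)]

# Executor.map runs fetch over the URLs using a pool of threads,
# returning the results in the same order as the input.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fetch, urls))

print(results[0])  # fetched https://example.com/0
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` is all it takes when the tasks are CPU-bound instead.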
-* Although fancier tools exist, the standard [argparse](https://docs.python.org/3/library/argparse.html) module is good enough for most argument parsing.
-  It has decent support for [sub-commands](https://docs.python.org/3/library/argparse.html#sub-commands), and the linked document describes a very nice pattern to define functions for sub-commands, under "One particularly effective way of handling sub-commands..."
-  Provide help text for non-obvious parameters.
-  argparse supports many different argument types out of the box, such as enumerated options, integers, and file names.
-  The main reason to use a fancier argument parser is that argparse does not have autocomplete support, but you can add [argcomplete](https://github.com/kislyuk/argcomplete) to an argparse program with minimal modifications to retrofit autocomplete.
-* Remember that the standard [json](https://docs.python.org/3/library/json.html) module is built-in.
-  You can use it to add a mode to your tool that generates JSON output instead of human-readable output, for easy automation of your tool, maybe using [jq](https://stedolan.github.io/jq/) or [fx](https://github.com/antonmedv/fx).
-* Use the standard [subprocess](https://docs.python.org/3/library/subprocess.html) module to execute other commands.
-  * Remember never to use `shell=True`, so among other things, your tool will work correctly with files that have spaces in their names.
-  * Use `check=True` so an exception is raised if the subprocess fails.
-    This is likely the best default behavior: although the resulting traceback is a bit ugly, it prevents subtle problems and is a safe option.
-
-You can find examples for many of those techniques in my [repos](https://github.com/alexpdp7?tab=repositories&q=&type=&language=python&sort=).
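A minimal sketch of the argparse sub-command pattern described above; the `greet` and `add` sub-commands are made up for illustration:

```python
import argparse

def cmd_greet(args):
    return f"hello {args.name}"

def cmd_add(args):
    return args.x + args.y

def build_parser():
    parser = argparse.ArgumentParser(description="Example tool")
    # dest and required make the sub-command mandatory.
    subparsers = parser.add_subparsers(dest="command", required=True)

    greet = subparsers.add_parser("greet", help="greet someone")
    greet.add_argument("name")
    # The pattern from the argparse documentation: each sub-parser
    # stores the function that implements its sub-command.
    greet.set_defaults(func=cmd_greet)

    add = subparsers.add_parser("add", help="add two integers")
    add.add_argument("x", type=int)
    add.add_argument("y", type=int)
    add.set_defaults(func=cmd_add)

    return parser

# In a real tool you would call parse_args() with no arguments so it
# reads sys.argv; an explicit list is used here for illustration.
args = build_parser().parse_args(["add", "2", "3"])
print(args.func(args))  # 5
```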
diff --git a/programming/python/dependency_handling.md b/programming/python/dependency_handling.md
deleted file mode 100644
index 8acf0cee..00000000
--- a/programming/python/dependency_handling.md
+++ /dev/null
@@ -1,111 +0,0 @@
-# Some brief notes about Python dependency management
-
-This article is mostly written for people who have already used Setuptools and have faced issues derived from its "limitations".
-Specifically, if you have seen files named `requirements.txt` and have wondered how they work, what problem they solve, and whether they are something you should investigate, I hope you find this article interesting.
-
-If you are starting to write Python software and you are looking at an introductory text about distributing your software and using dependencies, I would recommend skipping directly to the "new generation" Python packaging tools.
-This way, you can avoid most of the complexities in this post.
-You can also check out the [Python Packaging User Guide](https://packaging.python.org/en/latest/) and [my own prescriptive project setup recommendations](project_setup.md).
-
-Most programs can use third-party libraries to implement parts of their functionality without implementing everything from scratch.
-
-pip is the recommended package installer for Python.
-Python installers include pip, although pip is a component that can be installed separately from Python.
-Some Linux distributions separate pip from the main Python package (for example, Debian has a `python3` package and a `python3-pip` package), but a Python install without `pip` is not really fully functional for many purposes.
-
-pip fetches Python packages from diverse sources and adds them to a Python installation.
-Python packages can specify other packages as dependencies, so when pip installs a package, it also installs the required dependency chain.
-
-The traditional mechanism for packages to specify dependencies is Setuptools and other closely related projects.
-
-## About Setuptools
-
-Setuptools is a build and distribution system based on the distutils module that was part of the base Python library.
-
-Package metadata in Setuptools can be defined in many different ways, such as a `setup.py` file, a `setup.cfg` file, or a `pyproject.toml` file.
-In these files, you list the dependencies for your package, specifying the name of each package and constraints.
-
-Constraints define which version of a dependency you want to use.
-The constraint does not need to be an exact version; it can also be a range of versions, or a constraint such as "lower than version n".
-
-(Constraints can additionally specify other restrictions, such as requiring different versions for different Python versions, and other interesting possibilities.)
-
-In my opinion, although you can package applications and libraries properly using Setuptools, doing it correctly requires a lot of knowledge and effort, and is error-prone.
-
-## Version locking and `requirements.txt`
-
-There is a dependency-management approach that can be very effective in many cases.
-
-This approach involves differentiating between "applications" and "libraries".
-
-Libraries are Python packages meant to be used as a dependency by other Python code.
-Applications are Python code that may use other libraries as dependencies, but which no other Python code depends on.
-
-### Specifying dependencies for libraries
-
-Libraries specify coarse but safe dependency requirements.
-
-Suppose that we are developing the foo library.
-The foo library depends on the bar library.
-The bar library uses a versioning scheme similar to semantic versioning.
-When we develop the foo library, we use version 1.2.3 of the bar library.
-
-Then, we specify that the foo library depends on the bar library, with a version constraint like `>=1.2.3, <1.3`.
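As a sketch, using PEP 621 metadata in a `pyproject.toml` (foo and bar being the hypothetical libraries from this example), such a constraint could be declared as:

```toml
[project]
name = "foo"
version = "1.0.0"
dependencies = [
    "bar >=1.2.3, <1.3",
]
```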
-This version constraint lets the library be used with version 1.2.4, which is likely compatible with the code in the foo library, and may even introduce valuable bug fixes.
-However, version 1.3.0 of the bar library would not be a valid dependency.
-This is probably a good idea; version 1.3.0 may contain changes that the foo code is incompatible with.
-(When we later create new versions of the foo library, we may want to consider depending on newer versions of the bar library, and possibly update the code so it continues working correctly.)
-
-This helps reduce conflicts.
-As libraries specify coarse dependencies, the chances of two libraries having incompatible requirements are lower.
-However, specifying coarse dependencies probably requires more testing to ensure that the library works correctly if different dependency versions are installed.
-
-### Specifying dependencies for applications
-
-Applications specify exact dependency requirements.
-
-While libraries are not usually run on their own, applications are executed directly by end users.
-If a library does not work well, then you can temporarily go back to an older version or apply other fixes.
-But if an application does not work correctly, you have worse problems.
-
-If you specify exact dependency versions for an application, users of the application will always use a single combination of dependencies, which makes it easy to keep things robust.
-
-A popular approach is for applications to specify Setuptools requirements with coarse versioning (just like libraries do), but to provide a list of the specific versions used for development and deployment.
-To create this list of dependencies, you can install your application using pip or some other mechanism, then extract a list of the dependency versions that were installed and store it in a file.
-For example, you can do this by executing:
-
-```
-$ pip install .
# executed from the root of the application source code
-$ pip freeze >requirements.txt
-```
-
-Later on, if you install the application using the following command:
-
-```
-$ pip install -r requirements.txt
-```
-
-Then you will always install the same set of dependencies, preventing issues caused by updated dependencies.
-
-Note: pip and other package installers do *not* use `requirements.txt` or any other similar file outside the `setup.cfg` file and the other files defined in Setuptools.
-If you do not install your application explicitly using `pip install -r requirements.txt`, you will probably install a different set of dependencies.
-
-## Beyond version locking
-
-Following the approach above can be enough to use dependencies correctly.
-
-However, maintaining the Setuptools version dependencies and `requirements.txt` is straightforward, but tedious.
-Also, this approach to dependency management is not obvious, and may be hard to get completely right.
-
-For these reasons, several projects have appeared that implement approaches similar to the one described above, but more automatic and prescriptive.
-These projects often automatically manage a file equivalent to `requirements.txt`, while the developer only specifies coarse dependencies for applications.
-
-Some of these tools are listed by [a page about relevant packaging projects](https://packaging.python.org/en/latest/key_projects/) maintained by the [Python Packaging Authority](https://www.pypa.io/).
-Look for tools for managing dependencies and packaging.
-
-Thanks to some improvements in the Python ecosystem, pip can nowadays correctly install dependencies using many different packaging tools.
-
-These projects can also offer some other improvements, so I would encourage Python developers to investigate them and try them out.
-
-However, also note that, with a correct approach, Setuptools and manual version locking are perfectly valid ways to manage Python code dependencies.
-Also, there are projects such as [pip-tools](https://github.com/jazzband/pip-tools) that complement Setuptools, addressing many of the issues described here, without requiring entirely new packaging tools.
diff --git a/programming/python/project_setup.md b/programming/python/project_setup.md
deleted file mode 100644
index a5f0c789..00000000
--- a/programming/python/project_setup.md
+++ /dev/null
@@ -1,114 +0,0 @@
-There is a significant amount of Python project tooling. This document collects my personal recommendations on how to set up a Python project.
-
-It is not meant to reflect the best or most common practices, just my personal taste.
-
-# Use pipx
-
-Pipx is a tool that installs Python packages to your user environment. It creates an isolated environment for every tool, so if you install multiple packages they won't have version conflicts. It also takes care of adding a module's entrypoints to your user path.
-
-uv can do much the same, and additionally it can install most Python versions.
-However, at the time of writing this, pipx is available as a package in many Linux distributions, while uv is not.
-
-If your project can be packaged so that it works with pipx, then many Linux users will be able to install it with pipx after installing pipx with their package manager.
-
-uv can be more convenient for software that requires specific Python versions that are not available in Linux distributions, but uv itself generally cannot be installed with Linux package managers.
-
-# Use uv
-
-When using third-party dependencies in your Python code, it is very valuable to avoid installing any project-specific dependency outside the project.
-
-To achieve that, traditionally virtualenvs are used; those are miniature Python installations where you can install any library you want. Virtualenvs need to be explicitly activated to be used, so it is easy to have a virtualenv for each Python project you are working on.
-
-uv is a tool that leverages virtualenvs to manage a project's dependencies, managing virtualenvs automatically.
-uv can also manage Python distributions, automatically downloading Python versions other than those already on your system.
-
-There are many similar tools such as pipenv, and there are many different ways to specify a project's dependencies (`setup.py`, `requirements.txt`, etc.); uv provides a convenient way to do everything.
-
-Consider reading [some brief notes about Python dependency management](dependency_handling.md).
-
-# Test your code
-
-Write the necessary amount of tests so you can make changes to your code with confidence.
-
-If you find yourself iterating over a piece of code slowly, try to isolate the code you are writing so it can be tested in isolation for faster iteration.
-
-## Use pytest for testing
-
-Python provides *two* testing frameworks in its standard library, but they have some limitations:
-
-* `unittest` is an xUnit-style testing framework which follows non-PEP-8 naming conventions (probably because it was modeled on Java's JUnit), so extra work needs to be done to make your test cases PEP-8 compliant
-* `doctest` is a tool which allows you to run tests embedded in docstrings. For some code, it is great and helps you provide good, up-to-date documentation. However, a significant amount of code is awkward to test using `doctest`.
-
-Use `doctest` whenever you can, but outside that, use `pytest` to write PEP-8-compliant tests.
-
-Ensure that your test suite runs correctly by running `pytest` without any arguments.
-
-Use plain Python `assert` statements to check assertions in your tests; `pytest` does some magic to provide nice error messages on failed assertions.
-
-## Gate your changes with testing
-
-Set up your version control so changes cannot be made to your main codeline without passing continuous integration tests (and possibly, code review).
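As a small sketch of the `doctest`-plus-plain-`assert` approach described above (the `mean` function is a made-up example):

```python
import doctest

def mean(numbers):
    """Return the arithmetic mean of a sequence of numbers.

    >>> mean([1, 2, 3])
    2.0
    """
    return sum(numbers) / len(numbers)

def test_mean():
    # pytest would collect this function automatically and rewrite
    # the assert to produce a helpful message on failure.
    assert mean([2, 4, 6]) == 4.0

if __name__ == "__main__":
    # Run the doctests embedded in the docstrings of this module.
    assert doctest.testmod().failed == 0
    test_mean()
```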
-
-# Perform automated code formatting and static checking
-
-> [!NOTE]
-> I have been using [ruff](https://github.com/astral-sh/ruff) recently.
-> Initially, I had not used it enough to recommend it unconditionally over flake8/black, but I am liking it so far.
-> Consider testing it.
-> It requires slightly less configuration and comes with more lints.
-
-## Use Ruff
-
-Use Ruff to format and lint your code.
-
-# Version control
-
-## Use a minimal gitignore file
-
-See [use gitignore properly](../git/git_advice.md#use-gitignore-properly).
-
-## Keep your code together
-
-All the code you modify as part of the project should be kept in a single repository so you can make atomic changes. If you find yourself making changes across multiple repositories and having to coordinate them, consider merging those repositories.
-
-Use git submodules or similar mechanisms to refer to code you modify that must be kept external.
-
-Use [Josh](../git/combining_repos_with_josh_filter.md) to publish parts of the repository outside the main repository if needed.
-
-# Support multiple modern versions of Python
-
-Unless you have a specific requirement to support Python 2, don't.
-
-It is reasonable to support multiple versions of Python 3 from 3.4 onwards. Supporting the oldest versions might limit the features you can use (although features from more modern versions have been backported), so evaluate which operating systems and versions you need to support and try to support Python versions readily available for them (in Linux, by using mainline distro repos, for instance).
-
-Even if you are not running your code using the latest versions of Python, try to support all the newest available versions.
-
-Use continuous integration to run your tests in all supported versions of Python.
-
-# Use ipython and ipdb
-
-Add ipython and ipdb as development dependencies.
-
-# Versioning
-
-Unless you have a specific requirement to support multiple versions of your code or to distribute to a platform that *requires* versioning (such as PyPI), do not explicitly version your code but allow implicit versioning (e.g. it should be possible to identify which Git commit deployed code comes from).
-
-# Documentation
-
-Provide a `README` containing:
-
-* The purpose of the code
-* How to use the code
-* How to develop the code
-
-If the `README` becomes unwieldy, separate usage instructions into `USAGE` and/or development instructions into `HACKING`.
-
-Provide docstrings detailing the external interface of Python modules. Provide internal comments in modules detailing implementation.
-
-If you are developing a library/framework, consider using Sphinx. Sphinx can create a documentation website for a Python project, taking advantage of docstrings.
-
-# Distribution
-
-If your code can be executed from a command line, consider documenting installation via `pipx`.
-
-If your code has dependencies that are not trivial to install (such as Pandas), consider publishing a Docker image or using dependencies that are simpler to install. Design your Docker images so rebuilding the image on most changes is fast.
diff --git a/programming/python/python_modules_primer.md b/programming/python/python_modules_primer.md
deleted file mode 100644
index d80b96d3..00000000
--- a/programming/python/python_modules_primer.md
+++ /dev/null
@@ -1,272 +0,0 @@
-# Python Modules Primer
-
-## Prerequisites
-
-These instructions assume a Linux environment.
-A macOS environment is similar, but not identical.
-A Windows environment differs more.
-
-## Previous knowledge
-
-### A refresher on the `PATH` variable
-
-If you execute the following command in your terminal:
-
-```
-$ echo hello
-```
-
-, the shell searches for the `echo` command in the directories listed in your `PATH` environment variable.
-You can display your `PATH` variable by running:
-
-```
-$ echo $PATH
-/home/user/.local/bin:/home/user/bin:/usr/share/Modules/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin
-```
-
-The contents of the `PATH` variable depend on your particular environment.
-
-If you run the following command:
-
-```
-$ which echo
-/usr/bin/echo
-```
-
-, the `which` command prints where the shell locates the `echo` command.
-
-### A refresher on shell scripts
-
-If you create a file named `foo.sh` with the following contents:
-
-```
-#!/bin/sh
-
-echo hello
-```
-
-, you define a "shell script".
-The first line indicates that this shell script is executed by using the `/bin/sh` command.
-The rest of the file consists of commands to be executed by the shell.
-These commands behave as if you typed them into your terminal, so if you execute this script, the command `echo hello` will be executed, printing `hello`.
-
-If you try to run `foo.sh` like you run the `echo` command, by typing its name, it does not work:
-
-```
-$ foo.sh
-bash: foo.sh: command not found...
-```
-
-, because the shell looks for the `foo.sh` command in the directories listed in the `PATH` variable.
-Unless you created the `foo.sh` file in a directory like `/usr/bin`, the shell will not find the `foo.sh` command.
-
-A solution to this problem is to specify the path to the `foo.sh` file, instead of relying on the `PATH` variable.
-However, if you do this, you face a second problem.
-
-```
-$ ./foo.sh
-bash: ./foo.sh: Permission denied
-```
-
-This happens because only files with the executable permission can be executed in this way.
-To solve this, add the executable permission; then it works:
-
-```
-$ chmod +x foo.sh
-$ ./foo.sh
-hello
-```
-
-## The `import` statement in Python
-
-### Importing from the Python standard library
-
-Run the following commands in the Python REPL:
-
-```
-$ python3
-Python 3.9.17 (main, Aug 9 2023, 00:00:00)
-[GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] on linux
-Type "help", "copyright", "credits" or "license" for more information.
->>> import datetime
->>> datetime.datetime.now()
-datetime.datetime(2023, 9, 11, 21, 53, 16, 331236)
-```
-
-`import` works in a similar way to running a command in the shell.
-Python searches a number of directories looking for the `datetime` module.
-
-To see which directories are searched, run:
-
-```
-$ python3
->>> import sys
->>> sys.path
-['', '/usr/lib64/python39.zip', '/usr/lib64/python3.9', '/usr/lib64/python3.9/lib-dynload', '/home/alex/.local/lib/python3.9/site-packages', '/usr/lib64/python3.9/site-packages', '/usr/lib/python3.9/site-packages']
-```
-
-`sys.path` is a list of the directories that the `import` statement searches.
-The contents of `sys.path` depend on your operating system and Python installation method.
-
-On my system, the `/usr/lib64/python3.9` directory contains the `datetime.py` module.
-
-```
-$ head /usr/lib64/python3.9/datetime.py
-"""Concrete date/time and related types.
-
-See http://www.iana.org/time-zones/repository/tz-link.html for
-time zone and DST data sources.
-"""
-
-__all__ = ("date", "datetime", "time", "timedelta", "timezone", "tzinfo",
-           "MINYEAR", "MAXYEAR")
-...
-```
-
-`/usr/lib64/python3.9` contains the modules in [the Python standard library](https://docs.python.org/3/library/).
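As a sketch, you can observe the same search from Python itself using the standard `importlib` machinery (the exact path printed depends on your system):

```python
import importlib.util

# find_spec performs the same search that the import statement does,
# looking through the directories in sys.path.
spec = importlib.util.find_spec("datetime")
print(spec.origin)  # e.g. /usr/lib64/python3.9/datetime.py

# A module that exists nowhere on sys.path is simply not found.
print(importlib.util.find_spec("no_such_module_hopefully"))  # None
```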
-
-### Importing your Python files
-
-If you create a file named `a.py`:
-
-```
-def f():
    return 2
-```
-
-, and another named `b.py`:
-
-```
-import a
-
-print(a.f())
-```
-
-, then:
-
-```
-$ python b.py
-2
-```
-
-This works because `sys.path` contains `''`, which means "the current directory".
-
-(`sys.path` is very similar to the `PATH` variable. However, `sys.path` contains the current directory by default, whereas `PATH` does not.)
-
-When `import a` is executed, Python searches the directories in `sys.path` for an `a.py` file; it is found when checking the `''` path.
-When `import datetime` is executed, Python searches in the current directory (because `''` comes first in the path), doesn't find it, but then finds it in the following `/usr/lib64/python3.9` directory.
-Python iterates over the `sys.path` directories, and loads the *first* matching file.
-
-## Installing libraries
-
-When writing Python software, sometimes the modules included in the standard library are enough.
-However, frequently you want to use other libraries.
-To use Python libraries, you must install them using the `pip` program.
-
-The `pip` program is not part of the `python3` package in some Linux distributions, and comes from the `python3-pip` package.
-
-The `pip` program can download libraries from https://pypi.org/ , the Python package index, and install them.
-`pip` installs libraries to a "Python environment".
-
-Old versions of `pip` defaulted to installing libraries to the "system" Python environment.
-In a Linux system, the system Python environment is located in a directory such as `/usr/lib64/python3.9`.
-By default, normal Linux users cannot write to `/usr`, so installing a package would fail.
-
-Modern versions of `pip` detect that they cannot write to the "system" Python environment, and then redirect the install to the "user" Python environment.
-The "user" Python environment is in a directory such as `~/.local/lib/python3.9`.
-
-You could use a command such as `sudo pip install` to grant `pip` the privileges required to write to `/usr`.
-However, this can make a Linux system unusable.
-Most Linux systems use software that uses the "system" Python environment.
-Altering the "system" Python environment can break such software.
-Do not run `pip install` with root privileges unless you know why you need to.
-
-If you use a modern `pip` (or use the `--user` option), you can install libraries to the "user" Python environment.
-However, this is problematic because a Python environment can only contain a single version of a Python library.
-If you have two different Python programs that require different versions of the same library, then these two programs cannot coexist in the "user" Python environment.
-
-In general, Python virtual environments are used to address this problem.
-
-## Creating Python virtual environments
-
-If you run:
-
-```
-$ python3 -m venv <some path>
-```
-
-, this will create a directory at the path you specify, with the following contents:
-
-```
-<some path>
-├── bin
-│   ├── activate
-│   ├── pip
-│   ├── python
-├── include
-├── lib
-│   └── python3.9
-```
-
-The `python` and `pip` commands are copies of the same commands from the "system" Python environment.
-
-But these commands work differently from the "system" Python environment commands:
-
-```
-$ <some path>/bin/python
->>> import sys
->>> sys.path
-['', '/usr/lib64/python39.zip', '/usr/lib64/python3.9', '/usr/lib64/python3.9/lib-dynload', '<some path>/lib64/python3.9/site-packages', '<some path>/lib/python3.9/site-packages']
-```
-
-`sys.path` uses the `lib` directories in the virtual environment.
-
-When you use the `pip` program from the virtual environment, it installs the libraries to the virtual environment.
-
-You can create as many virtual environments as you need, and you can install different versions of libraries to each virtual environment.
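A small related sketch: from inside Python, you can tell whether you are running in a virtual environment, because `sys.prefix` points at the virtual environment while `sys.base_prefix` keeps pointing at the base installation:

```python
import sys

# In a virtual environment, sys.prefix differs from sys.base_prefix;
# outside one, they are equal.
in_virtualenv = sys.prefix != sys.base_prefix

print(sys.prefix)
print(sys.base_prefix)
print(in_virtualenv)
```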
-
-## Activating Python environments
-
-You can run the `python` and `pip` commands by specifying the full path, like we did when executing the `foo.sh` command earlier.
-
-By default, if you run `python`, the shell will invoke the `python` command from the "system" Python environment because it is in a directory included in the `PATH` variable.
-If you specify the full path, you override this.
-
-To save typing, the `bin` directory of a virtual environment contains an `activate` file.
-The `activate` file is a "special" shell script that must be invoked in one of the following two ways:
-
-```
-$ source <some path>/bin/activate
-```
-
-```
-$ . <some path>/bin/activate
-```
-
-`source` and `.` are synonyms.
-They are special shell commands that are needed for the `activate` script to work correctly.
-
-`activate` alters your path, so that the `bin` directory in your virtual environment comes first in your path.
-
-```
-$ echo $PATH
-/home/user/.local/bin:/home/user/bin:/usr/share/Modules/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin
-$ . <some path>/bin/activate
-(some path) $ echo $PATH
-<some path>/bin:/home/user/.local/bin:/home/user/bin:/usr/share/Modules/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin
-```
-
-, and thus if you run `python`, `<some path>/bin/python` will be executed instead of `/usr/bin/python`.
-
-Besides changing your prompt to indicate that the virtual environment is activated (and setting the `VIRTUAL_ENV` variable), `activate` only alters your `PATH`.
-Using `activate` is not mandatory to use a virtual environment.
-For example, if you run Python by specifying the full path of the Python executable in a virtual environment, it will execute as if the virtual environment had been activated.
-Tools such as `poetry` have commands such as `poetry run` that can run commands inside a virtual environment without activating it.
-Activation can save time, but it is also more error-prone than more explicit means of using virtual environments.
-
-## Further reading
-
-* [Some brief notes about Python dependency management](dependency_handling.md) continues this explanation, introducing the need for packaging tools.
-* [Installing Python Modules](https://docs.python.org/3/installing/index.html), from the official Python documentation, describes the `pip` program in more depth.
-* [`venv` - Creation of virtual environments](https://docs.python.org/3/library/venv.html), from the official Python documentation, describes virtual environments in more depth.
diff --git a/programming/python/scraping_with_selenium_on_docker.md b/programming/python/scraping_with_selenium_on_docker.md
deleted file mode 100644
index 61ba1c12..00000000
--- a/programming/python/scraping_with_selenium_on_docker.md
+++ /dev/null
@@ -1,7 +0,0 @@
-Don't use Selenium, use [Playwright](https://playwright.dev/python/):
-
-* Playwright automatically sets up headless browsers.
-* It provides convenient abstractions for locating elements in a page (mostly no XPath required; it can match "intelligently" using text).
-* It has a handy UI tool that records your actions in a browser and writes equivalent *readable* Playwright code.
-
-Further reading: https://new.pythonforengineers.com/blog/web-automation-dont-use-selenium-use-playwright/
diff --git a/programming/so_you_want_to_play_with_functional_programming.md b/programming/so_you_want_to_play_with_functional_programming.md
deleted file mode 100644
index ecf2cce3..00000000
--- a/programming/so_you_want_to_play_with_functional_programming.md
+++ /dev/null
@@ -1,249 +0,0 @@
-# So you want to play with functional programming
-
-If you are a programmer working in popular languages such as Python or Java, you are likely to have read articles about "functional programming".
-These articles can give you the idea that learning functional programming improves your skills as a programmer.
-I share this opinion.
-
-This article tries to help people who have read about functional programming figure out how to proceed.
-
-Note that this article expresses personal opinion.
-In particular, I am not an expert in this topic:
-
-* I have programmed some Haskell (about 50 Project Euler problems, plus experimentation on and off over the years).
-* I have formal training in SML and functional programming.
-* I have some minimal experience with Lisp.
-* I have applied some functional programming techniques while being paid to write in non-functional programming languages.
-* However, I have never been paid to write in any of those languages.
-
-Shortly after writing this, I was shown:
-
-https://technomancy.us/194
-
-I agree with most of what that article explains.
-I might extend this article with some similar ideas, but for the moment, I recommend reading that carefully before reading the rest of this article.
-
-## The basics of functional programming
-
-[The Wikipedia article on functional programming](https://en.wikipedia.org/wiki/Functional_programming) is a great place to get started.
-
-The article describes a few concepts related to functional programming.
-I consider the following two the pillars of functional programming:
-
-* First-class and higher-order functions.
-  In languages with first-class functions, functions are values that you can use like values of other types, such as integers.
-  Higher-order functions are functions that take functions as arguments or return functions.
-
-* Pure functions.
-  Pure functions always return the same value for a given set of arguments.
-  Pure functions also have no side effects; they do not modify anything in the system they run on.
-  For example, a function that creates a file is not pure.
-
-These concepts can be applied in most popular programming languages.
-
-For example, in Python:
-
-```
-def hello():
-    print("hello")
-
-def twice(f):
-    f()
-    f()
-```
-
-`twice` is a higher-order function because it takes a function as an argument.
-Functions in Python are first-class values because you can use `hello` as a value:
-
-```
->>> twice(hello)
-hello
-hello
-```
-
-Similarly, you can write pure functions in almost any language.
-
-When you have first-class functions, you can define higher-order functions that generalize common code.
-Three very common higher-order functions are:
-
-* Filter.
-  Filter applies a function to each element of a list, and returns a list composed of the elements for which the function returned true.
-* Map.
-  Map applies a function to each element of a list, and returns the list of the results of applying the function to each element.
-* Fold.
-  A fold starts from an initial value, then calls a function with the initial value and the first element of the list.
-  Then it calls the function with the result of the previous call and the next element of the list.
-  This continues until the end of the list, returning the last result of the function.
-
-(For example, folding with the sum operator and an initial value of 0 sums the elements of a list.)
-
-Note that you can implement many list manipulations by composing filters, maps, and folds with different functions.
-(And by adding more higher-order functions, you can implement more list manipulations.)
-
-Also, you can manipulate other data structures with equivalent or other higher-order functions.
-
-Implementing code using higher-order functions and pure functions already has some interesting benefits:
-
-* Impure functions frequently require more mental overhead to understand, because you need to keep track of state.
-  With pure functions, you do not have to think about state.
-
-* To understand a program written as a composition of functions, you can start by understanding individual functions and then understand how they fit together.
-  The same program written as a sequence of statements is often more difficult to understand.
-  (However, sometimes the opposite effect occurs.)
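The three higher-order functions described above can be sketched in Python, where `functools.reduce` is the standard library's fold (the sample data and lambdas are just for illustration):

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# Filter: keep the elements for which the function returns true
evens = list(filter(lambda n: n % 2 == 0, numbers))

# Map: apply the function to each element, collecting the results
squares = list(map(lambda n: n * n, numbers))

# Fold: combine the elements into one value, starting from 0
total = reduce(lambda acc, n: acc + n, numbers, 0)

print(evens)    # [2, 4]
print(squares)  # [1, 4, 9, 16, 25]
print(total)    # 15
```

All three take a function as an argument, which is only possible because Python functions are first-class values.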
-
-You can use these concepts in most popular programming languages.
-(Most popular languages also provide higher-order functions such as filters, maps, and folds.)
-
-So you can get started with functional programming by using the programming languages you already know:
-
-* Try to write as much code as possible as pure functions.
-* Learn which higher-order functions your programming language provides.
-* Learn how to implement higher-order functions.
-* Write code by composing pure functions with higher-order functions.
-
-## The consequences of first-class functions, higher-order functions, and pure functions
-
-Writing code using these concepts often leads to:
-
-* Writing cumbersome code if the programming language you use lacks certain features.
-* Unlocking additional functional programming techniques.
-
-Therefore, many programming languages provide features that make functional programming more straightforward, or features enabled by functional programming.
-Languages providing features related to functional programming are commonly called "functional programming languages".
-
-Although you can use functional programming in non-functional programming languages, this can often lead to:
-
-* Extra effort
-* Not being able to use the full spectrum of functional programming features
-
-### The need for powerful type systems and type inference
-
-Higher-order functions often have complex type requirements.
-For example, to filter a list of a given type, you must pass a function that takes a single argument of that type and returns a boolean.
-If the arguments do not have the correct types, then the code does not work correctly.
-
-In languages with dynamic types, the program fails at runtime.
-In languages with static types, you frequently must specify the types, and higher-order functions often require complex types involving different function types.
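As a sketch of why the types get complex, here is a filtering function written with Python's optional type hints (the `keep` name and sample predicate are hypothetical); its signature already involves a function type, and because Python's hints are not enforced at runtime, passing the wrong kind of function is only caught by an external checker such as mypy, or when the program fails:

```python
from typing import Callable

def keep(pred: Callable[[int], bool], items: list[int]) -> list[int]:
    """Return the items for which pred returns True."""
    return [item for item in items if pred(item)]

print(keep(lambda n: n > 2, [1, 2, 3, 4]))  # [3, 4]
```

A statically typed functional language would reject a call like `keep(42, ...)` at compile time instead.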
-
-Functional programming languages frequently:
-
-* Have static types, to prevent frequent runtime failures.
-* Automatically infer types instead of requiring programmers to declare them.
-  (However, automatic type inference can cause issues in some scenarios, so programming languages frequently allow writing explicit types, or even require explicit types in some cases.)
-
-Because functional programs often use more complex types, functional programming languages often have more powerful type systems than non-functional programming languages.
-
-These properties give rise to the "if it compiles, it works *correctly*" phenomenon associated with functional programming languages.
-This phenomenon helps avoid incorrect programs.
-
-## Functional programming languages
-
-### Haskell
-
-Functional programming practitioners often recommend Haskell as a functional programming language.
-
-[According to Wikipedia](https://en.wikipedia.org/wiki/Haskell), "Haskell is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation".
-Also, Haskell was designed by a committee whose purpose was "to consolidate existing functional languages into a common one to serve as a basis for future research in functional-language design".
-
-* Haskell is perhaps the language with the most built-in functional programming features.
-  As mentioned, Haskell is used for research on functional programming, so many new concepts appear in Haskell first.
-* Haskell is also very strict about functional programming, so it drives programmers more strongly towards avoiding non-functional programming.
-* Haskell's syntax is designed so that programs can be extremely terse and contain almost no extraneous syntax.
-* Haskell is a very popular language, with a very large ecosystem.
-  You can take advantage of many existing libraries and tools to develop real-world programs faster.
-
-However, Haskell's strengths can also be drawbacks for learning.
-
-* Haskell uses "lazy" evaluation, where most programming languages use "eager" evaluation.
-  Haskell does not evaluate expressions until they are needed (and might never evaluate some expressions).
-  Lazy evaluation can improve efficiency.
-  However, lazy evaluation can also cause unexpected performance problems.
-  [Foldr Foldl Foldl'](https://wiki.haskell.org/Foldr_Foldl_Foldl%27) explains how choosing incorrectly among different implementations of fold can lead to significant performance problems.
-  When writing Haskell code for learning, you are likely to stumble into issues not present in languages that use eager evaluation.
-
-* Haskell is very strict about purity.
-  To implement programs that have side effects, such as accessing files, you must use specific language features.
-  Many articles try to explain those features, because many people have trouble understanding them.
-
-* Many libraries and tools in the ecosystem take advantage of powerful features enabled by Haskell.
-  However, using these libraries and tools may require understanding the features they are built upon.
-
-Also, Haskell's syntax is very terse, which can lead to Haskell compilers producing unclear error messages.
-For example:
-
-```
-$ ghci
-> let sum a b = a + b
-> sum 2 2
-4
-> sum 2 2 2
-
-<interactive>:3:1: error:
-    • Non type-variable argument in the constraint: Num (t1 -> t2)
-      (Use FlexibleContexts to permit this)
-    • When checking the inferred type
-        it :: forall {t1} {t2}. (Num t1, Num (t1 -> t2)) => t2
-```
-
-From that error message, programmers new to Haskell might have trouble identifying that a function has been called with an extra argument, especially in complex programs.
-
-Personally, Haskell is my favorite functional programming language.
-However, I learned Haskell after learning (with teachers and support from others) other functional programming languages.
-I think that Haskell is ideal for learning the most powerful concepts in functional programming, but not ideal as a first functional programming language.
-
-(Note that these recommendations come from someone who has only implemented about 50 Project Euler problems in Haskell, and has experimented on and off with the language, but has not been paid for it.)
-
-### Lisp
-
-Many programmers like Lisp and languages in the Lisp family, such as Scheme or Clojure.
-Lisp programmers often recommend Lisp for learning functional programming.
-
-Lisp is a very minimalistic, yet infinitely flexible language.
-Lisp is extensible, so you can add most programming language features to Lisp, including functional programming features.
-
-Therefore, you can do functional programming in Lisp, and also benefit from all other Lisp features.
-
-However, languages in the Lisp family tend not to have static typing and its associated features, and thus do not frequently exhibit the "if it compiles, it works *correctly*" phenomenon.
-
-Lisp has one of the simplest syntaxes of any programming language.
-The simple syntax of Lisp is directly tied to its power.
-Many favor the Lisp syntax and argue that it makes Lisp better for learning programming.
-Personally, I find the Lisp syntax hard to read and write, and likely an additional difficulty on top of learning functional programming.
-
-I recommend learning Lisp because it is a unique programming language that can teach you many programming language concepts that are not present in many other languages.
-However, I do not recommend Lisp for learning functional programming (unless you already know Lisp).
-
-(Note that these recommendations come from someone who has some formal training in Lisp but only uses Lisp infrequently [as a recent Emacs user].)
-
-### The ML family of programming languages
-
-ML is a language that appeared in 1973.
-Since then, three dialects have become the most popular implementations of ML:
-
-* OCaml
-* Standard ML
-* F# (part of the .NET platform)
-
-In particular, OCaml and F# have very strong ecosystems (OCaml because it is a popular and mature language; F# because, as part of the .NET platform, it can use many .NET libraries and tools).
-
-Haskell is inspired by ML, but many of the Haskell features discussed above are not present in the ML languages:
-
-* MLs have eager evaluation, thus avoiding the performance pitfalls of Haskell.
-* MLs have simpler syntax, frequently leading to clearer error messages.
-
-For example, compare the following snippet of OCaml to the previous error message example from Haskell:
-
-```
-$ utop # utop is a friendlier OCaml REPL
-# let sum a b = a + b ;;
-val sum : int -> int -> int = <fun>
-# sum 2 2 ;;
-- : int = 4
-# sum 2 2 2 ;;
-Error: This function has type int -> int -> int
-       It is applied to too many arguments; maybe you forgot a `;'.
-```
-
-In my opinion, OCaml and F# are better languages than Haskell for initially learning functional programming.
-After learning an ML, you are likely to be better prepared to learn Haskell and more sophisticated functional programming.
-
-(Note that these recommendations come from someone who has only experimented with OCaml and F#, and learned SML formally.)
diff --git a/programming/the-content-web-manifesto/NOTES.org b/programming/the-content-web-manifesto/NOTES.org
deleted file mode 100644
index b56c4bf1..00000000
--- a/programming/the-content-web-manifesto/NOTES.org
+++ /dev/null
@@ -1,30 +0,0 @@
-* [[README.md][README]]
-
-- https://www.fixbrowser.org/
-
-** Document how terminal browsers can invoke a full browser to execute JavaScript
-
-See [[https://www.gnu.org/software/emacs/manual/html_node/eww/Advanced.html]]; w3m has similar features.
-
-Also:
-
-- https://github.com/abhinavsingh/proxy.py Extensible Python proxy
-- https://github.com/TempoWorks/txtdot
-- https://github.com/4383/photonos
-- https://sr.ht/%7Ebptato/chawan/
-- https://offpunk.net/
-
-Browsers as a platform to manage content:
-
-- Just view the content as HTML with user-defined styling
-- Archive everything we see, so that we can easily locate content we have read, share it with others, etc.
-- RSS/Gemfeed/content subscription
-
-** Annotate URLs with other URLs
-
-- For example, add transcriptions to comic strips that do not have them
-- The server pushes serialized Bloom filters of annotated URLs (or entire annotation sets?) so that clients do not have to leak what they are browsing.
-- Maybe https://dokie.li/
-- Alternative approach with Violentmonkey for accessibility purposes: [[https://github.com/alexpdp7/aelevenymonkey]].
-
-** NoScript configuration merge
diff --git a/programming/the-content-web-manifesto/README.md b/programming/the-content-web-manifesto/README.md
deleted file mode 100644
index d05256b9..00000000
--- a/programming/the-content-web-manifesto/README.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# The content web manifesto
-
-These are my recommendations for creating "content" websites.
-In a content website, visitors mostly read content.
-Some example content websites are Wikipedia, news websites, and blogs.
-
-Also see [further notes](NOTES.org).
-
-## General guidelines
-
-### Test your website with a terminal browser without JavaScript, like w3m, lynx, or elinks
-
-If your website is usable with one of those browsers, then:
-
-* Your website does not require JavaScript to load.
-  This automatically addresses most annoyances with content websites.
-  Websites that do not require JavaScript tend to require fewer resources, making them faster and lighter.
-
-* Your website does not rely on non-text content.
-  Text content is uniquely flexible; it is frequently the media most amenable to being processed by the following systems and processes:
-
-  * Text-to-speech systems
-  * Translation (both human and automatic)
-  * Editing (making changes to text content)
-  * Quoting/embedding (readers can copy parts of your text to cite or promote your content)
-
-  Images, audio, video, or other interactive media might be required to convey the message of your content.
-  Therefore, the content web manifesto does not forbid their use.
-  However, non-text content should always be accompanied by at least a text description of the content, and ideally, an alternate text version of the content.
-
-* Your website will work with user styling.
-  Providing a visual style via CSS and other mechanisms is fine, but users should be able to read your content with *their* choice of font, text size, colors, and so on.
-  This is important for accessibility, but also for everyone's comfort.
-
-Most importantly, this weakens the browser monopolies controlling the web.
-Not even massive companies like Microsoft dare to maintain a browser engine, leaving the web subject to the power of the very few browser vendors in existence.
-But if your web content can be read in a terminal browser without JavaScript, then your content is automatically accessible to a massive number of browsers, including very simple ones.
-
-(Alternatively, use [the Gemini protocol](https://geminiprotocol.net/).)
-
-### Provide granular URLs
-
-When providing a significant amount of content, make sure readers can link to the specific content of interest.
-
-This can be achieved by:
-
-* Splitting your content into different pages
-* Providing HTML headings with anchors
-
-### Date content
-
-Always make initial publication and last-edit dates available.
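As a sketch of the anchor technique, a heading with an `id` attribute is enough; the heading text and URL here are illustrative:

```html
<h2 id="granular-urls">Provide granular URLs</h2>
<!-- Readers can now link directly to this section:
     https://example.org/page#granular-urls -->
```

No JavaScript is needed; fragment navigation is built into every browser, including terminal ones.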
diff --git a/programming/the_tragedy_of_the_geeks.md b/programming/the_tragedy_of_the_geeks.md
deleted file mode 100644
index c1193d5f..00000000
--- a/programming/the_tragedy_of_the_geeks.md
+++ /dev/null
@@ -1,75 +0,0 @@
-# The tragedy of the geeks
-
-I have been hooked since the first computer entered our home.
-That happened more than four decades ago, and continuously tinkering with computers has given me a well-paid and comfortable job.
-
-However, getting such jobs seems linked to spending a significant amount of your personal time practicing your skills.
-
-Many people seek careers related to computing because the jobs have attractive conditions.
-However, they might later regret the time and energy spent trying to get into the field when they learn that getting a good job requires unexpected effort.
-
-This document tries to explain this phenomenon to people who want to work with computers, to help them make a better decision.
-
-## Tinkering
-
-Working with computers is the only career I can think of where all of the following are true at the same time:
-
-* You can work on personal projects that are very similar to the projects you would do in a job.
-* There is a reputation of abundant well-paid job offers with good conditions.
-* Working on personal projects sounds fun.
-
-This means that many of us end up spending a significant amount of time working on personal projects.
-This time investment increases our skills and the things we know.
-
-## Hiring
-
-Hiring is one of the most hotly debated topics in this industry.
-
-Many people believe that many candidates cannot do the job.
-There are many stories about new hires who cannot write simple programs.
-
-Whether this is common or not is not as important as whether the people making decisions believe there are large differences between candidates.
-When people who hire think that their hiring decision is going to have a large effect on them, they want to make sure that they pick the right person.
-
-My perception is that most of the organizations that offer good job conditions (and many that do not) try to be very selective when hiring people.
-
-## Hiring tinkerers
-
-When you are hiring people, candidates who have spent significant time on personal projects tend to stand out over candidates who have not.
-
-This improved perception during the hiring process does not necessarily translate into improved performance on the job.
-However, I believe that people who tinker in their spare time tend to land better jobs.
-
-## Handing out advice
-
-Because there are good jobs working with computers, many people think about making a career in the industry.
-
-There are many curricula and formal education programs, from shorter (typically one year) to longer (four or five years).
-
-Some of them provide advice on how to land a good job, and students who follow programs that do not, tend to ask for advice.
-In any case, one of the most frequent pieces of advice on the topic is tinkering on your own time.
-
-I believe this is actually good advice, in that it is likely to be an efficient way to improve your prospects.
-
-However, remember that hiring is roughly a competitive process.
-An organization evaluates a group of candidates, and tries to pick the best one.
-
-So, if more candidates tinker (because this is effective advice), the more you need to tinker to stand out.
-
-I cannot estimate how much you need to tinker on your own time to land a good job, but my guess is that it is more than someone wanting to get into the field expects.
-
-As long as this dynamic continues, the tinkering required to land a good job will increase.
-Only reduced competition can reduce the tinkering required, and reduced competition can happen through a few factors, such as increased demand for workers or a reduction in job seekers.
-
-## Breaking the cycle
-
-I cannot think of much that we can individually do to break the cycle.
-
-Maybe if people coming into the field are aware of this phenomenon, they will be able to make a better decision about what to do.
-
-If a sufficient number of people decide that the time investment is not worthwhile, then perhaps the competition will decrease.
-And if people are well informed and decide to move forward, at least they will be less likely to become frustrated or regret their decision.
-
-## Further reading
-
-* [A paean to programming](https://bertrandmeyer.com/2025/04/23/a-paean-to-programming/), by [Bertrand Meyer](https://en.wikipedia.org/wiki/Bertrand_Meyer)
diff --git a/programming/we_cant_code_good_so_take_llm_hype_with_a_grain_of_salt.md b/programming/we_cant_code_good_so_take_llm_hype_with_a_grain_of_salt.md
deleted file mode 100644
index 675f8a2e..00000000
--- a/programming/we_cant_code_good_so_take_llm_hype_with_a_grain_of_salt.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# We can't code good (so take LLM hype with a grain of salt)
-
-Software engineering is a young field compared to most other fields.
-
-All studies about software engineering projects I have seen say that most projects are late, over budget, and underdeliver.
-
-(A great article did some statistics to estimate the percentage of software engineers who have never worked on a successful project.
-The result was discouraging.
-I have never been able to find this article again; please contact me if you know it.)
-
-Consider CRUD applications, which most software engineers consider to be the lowest kind of work.
-A massive number of CRUD applications have been developed, and [I believe we are not effective at churning them out](crud_is_an_important_unsolved_problem.md).
-
-[What we know we don't know: empirical software engineering](https://www.hillelwayne.com/talks/ese/ddd/) is an interesting talk by Hillel Wayne that elaborates on this topic.
-
-The talk claims that researchers who try to prove that any engineering practice has a significant effect on delivery mostly fail.
-Apparently, code review is one of the few practices that has been proven effective, but its effect is small compared to sleeping well, not being stressed, or working the right number of hours.
-
-This includes practices that most of us believe have significant effects, such as using dynamic or static typing, abbreviating identifiers or not, and many others.
-
-We all believe that some practices are effective, yet this does not seem to be measurable.
-
-I think experience, good teamwork, smarts, and motivation help.
-(And I believe there are certainly things that *hinder* software engineering!)
-We have been able to deliver good works of engineering, and some things seem to work, but we can't code good yet.
-
-So *any* claim of *any* practice being effective should be taken with a dose of healthy skepticism.
-I do not even know how to measure software engineering effectiveness; I can use my experience to compare and try to infer conclusions, but I know my conclusions can be wrong.
-
-I am sure that at some point our discipline will be as mature as other disciplines, but think of how much effort it took other disciplines to get there.
-Until then, we should be humble and not make bold claims, and we should question the incentives of those who make bold claims.
-Perhaps some of those bold claims will turn out to be true, but the longer you have been in this field, the more bold claims you will have seen fail.
diff --git a/programming/when_to_code.md b/programming/when_to_code.md
deleted file mode 100644
index c0bead9e..00000000
--- a/programming/when_to_code.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# When to code
-
-Our job as programmers is to solve real-world problems in a satisfactory manner.
-As we are coders, our first instinct is to jump in and write some code.
-This is often a mistake. Consider:
-
-* Coding is very expensive.
-  It takes a lot of time: not just coding, but gathering requirements from the business people, deploying to a production environment, doing end-user testing, maintaining it...
-* There is a massive amount of software out there; chances are, someone had the same need before and paid someone to address it.
-  If it is a common problem, there is a chance that something exists that fits your needs exactly (and if it doesn't fit exactly, you might need to rethink your requirements; someone else went through the full process and you probably have not, so they might have figured out something you have not).
-  Perhaps it has a price tag, and maybe you think the price is excessive, but if they are commercializing it and have a few customers, consider that the cost of development is divided among all customers, an advantage you might not have.
-* Even if there is no specific off-the-shelf software that solves your needs, some generic software, such as a spreadsheet or a content management system, or some specific feature, such as a word processor's mail merge functionality, might come close enough to your needs.
-  It is likely that at some point you will outgrow it (and software such as spreadsheets is especially amenable to being twisted and stretched terribly beyond its intended use case, resulting in terrible monsters you should be wary of), but starting with it might help you frame the problem better than starting with code.
-
-Still, there is a lot of uncharted territory: there are needs for which no decent off-the-shelf software exists, or you might have already examined all the possibilities and they truly don't fit you.
-Or maybe you want to create a competing product, or do something truly innovative.
-Or (this is a common scenario) you are not able to convince someone to use an off-the-shelf solution.
-In that case, you will have to code.
